ABSTRACT

Massive (stellar mass $> 3 \times 10^{10}\,M_\odot$), passively evolving galaxies at redshifts z > 1 exhibit on average
physical sizes smaller by a factor $\approx 3$ than local early type galaxies (ETGs) endowed with the same stellar mass.
Small sizes are in fact expected on theoretical grounds, if dissipative collapse occurs. Recent results show that 
the size evolution at z < 1 is limited to less than 40%, while most of the evolution occurs at z > 1, where both 
compact and already extended galaxies are observed and the scatter in size is remarkably larger than locally. 
The presence at high redshift of a significant number of ETGs with the same size as their local counterparts 
as well as of ETGs with quite small size (< 1/10 of the local one), points to a timescale to reach the new, 
expanded equilibrium configuration of less than the Hubble time $t_H(z)$. We demonstrate that the projected
mass of compact, high redshift galaxies and that of local ETGs within the same physical radius, the nominal 
half-luminosity radius of high redshift ETGs, differ substantially, in that the high redshift ETGs are on the 
average significantly denser. This result suggests that the physical mechanism responsible for the size increase 
should also remove mass from central galaxy regions (r < 1 kpc). We propose that quasar activity, which peaks 
at redshift $z \sim 2$, can remove large amounts of gas from central galaxy regions on a timescale shorter than, or of
order of the dynamical one, triggering a puffing up of the stellar component at constant stellar mass; in this case 
the size increase goes together with a decrease of the central mass. The size evolution is expected to parallel 
that of the quasars and the inverse hierarchy, or downsizing, seen in the quasar evolution is mirrored in the 
size evolution. Exploiting the virial theorem, we derive the relation between the stellar velocity dispersion of 
ETGs and the characteristic velocity of their hosting halos at the time of formation and collapse. By combining 
this relation with the halo formation rate at $z \gtrsim 1$ we predict the local velocity dispersion distribution function.
On comparing it to the observed one, we show that velocity dispersion evolution of massive ETGs is fully 
compatible with the observed average evolution in size at constant stellar mass. Less massive ETGs (with stellar 
masses $\lesssim 3 \times 10^{10}\,M_\odot$) are expected to evolve less both in size and in velocity dispersion, because their
evolution is ruled essentially by supernova feedback, which cannot yield winds as powerful as those triggered 
by quasars. The differential evolution is expected to leave imprints in the size vs. luminosity/mass, velocity 
dispersion vs. luminosity/mass, central black hole mass vs. velocity dispersion relationships, as observed in 
local ETGs. 

Subject headings: galaxies: formation - galaxies: evolution - galaxies: elliptical - galaxies: high redshift - 
quasars: general 



1. INTRODUCTION 

Most of the massive (stellar mass $M_\star > 3 \times 10^{10}\,M_\odot$),
passively evolving galaxies at z > 1 observed with high enough
angular resolution exhibit characteristic sizes of their stellar
distributions much more compact than local early type galaxies
(ETGs) of analogous stellar mass (Ferguson et al. 2004;
Trujillo et al. 2004, 2007; Longhetti et al. 2007; Toft et al.
2007; Zirm et al. 2007; van der Wel et al. 2008; van Dokkum
et al. 2008; Cimatti et al. 2008; Buitrago et al. 2008;
Damjanov et al. 2009). This very interesting property of
massive ETGs adds to other important features: (i) luminosity,
half-luminosity (or effective) radius $R_e$ and velocity
dispersion $\sigma$ of ETGs fall in a narrow range around the so-called
Fundamental Plane (Djorgovski & Davis 1987; Dressler et al.
1987); (ii) the color-magnitude (e.g., Visvanathan & Sandage
1977; Sandage & Visvanathan 1978; Bower et al. 1992a) and
color-$\sigma$ (Bower et al. 1992b; Bernardi et al. 2005) relations;
(iii) the increasing $\alpha$-enhancement with increasing mass (see
the discussion by Thomas et al. 1999); (iv) the generic
existence of a supermassive black hole (BH) in their centers with
mass $M_\bullet \approx 10^{-3} M_\star$ (Magorrian et al. 1998; see Ferrarese &
Ford 2005 for a review).

The first three properties imply that massive ETGs are old 
systems, formed at $z_{\rm form} > 1.5$ on a timescale shorter than 1
Gyr; the environment plays a minor, but non-negligible, role, 
ETGs in lower density environments being only about 1-2 
Gyr younger (for a review see Renzini 2006). Such properties 
are extremely demanding for any scenario of galaxy forma- 
tion, in particular if one sticks to the hierarchy implied by the 
primordial power spectrum imprinted on dark matter (DM) 
perturbations. On the other hand, the physics of baryons (i.e., 
their cooling/heating mechanisms and related feedback pro- 
cesses) has to play a fundamental role in galaxy formation 
(e.g., Larson 1974a,b; White & Rees 1978). The baryon con- 
densation in cold gas and stars within galactic DM halos is 
the outcome of complex physical processes, including shock 
waves, radiative and shock heating, viscosity, radiative cool- 
ing. 
In addition, the linear relationship between the central
supermassive BH and the stellar component of ETGs (point (iv)
above) can be the result of the gas removal by large
quasar-driven winds (e.g., Silk & Rees 1998). On one side, this
hypothesis increases the complexity of the galaxy formation 
process, since star formation, BH accretion, gas inflow and 
outflow are interconnected and occur on quite different space 
and time scales. On the other side, this additional ingredi- 
ent is very helpful. In fact, Granato et al. (2001, 2004) 
show that quasar-driven winds (also named quasar feedback) 
can explain the observed $\alpha$-enhancement of massive ETGs,
the large number of submillimeter-selected galaxies showing
huge star formation rates $> 1000\,M_\odot$ yr$^{-1}$ (e.g., Serjeant
et al. 2008; Dye et al. 2008) and the presence of massive,
passively evolving galaxies at z > 1.5. They also demonstrate that
quasar winds are very effective in modifying the hierarchy fol- 
lowed by the assembling of DM halos, as they can account for 
the shorter periods of star formation in more massive galaxies 
as required by the observed galaxy stellar mass functions of 
ETGs, which clearly show evidence of the so called downsiz- 
ing (Cowie et al. 1996; Perez-Gonzalez et al. 2008; Serjeant 
et al. 2008). Recently, quasar feedback has been included in 
almost all semianalytic models and numerical simulations of 
galaxy formation, though with different recipes (see Springel 
et al. 2005; Croton et al. 2006; Sijacki et al. 2007; Somerville 
et al. 2008; Johansson et al. 2009). 

As a matter of fact, observations of z > 1 galaxies exhibiting 
high star formation rates find evidence of gas in various states, 
from molecular to highly ionized, with mass of the same or- 
der of the stellar mass (Cresci et al. 2009; Tacconi et al. 2008, 
2010). If such large amounts of gas are removed during the 
quasar activity, then large outflows of metal enriched gas are 
expected. Such massive outflows (with rates $\dot{M}_{\rm out} \gtrsim 1000\,M_\odot$
yr$^{-1}$) have been tentatively detected around quasars (e.g.,
Simcoe et al. 2006; Prochaska & Hennawi 2009; Lipari et al.
2009). D'Odorico et al. (2004), studying narrow absorption 
line systems associated to six quasars, have shown that these 
outflows have chemical composition implying rapid enrich- 
ment on quite short timescales (see also Fechner & Richter 
2009). The ejection of most of the baryons initially present 
in protogalactic halos is obviously necessary to explain the 
much lower baryon to DM ratio in galaxies compared to the 
mean cosmic value. 

Fan et al. (2008) argued that rapid expulsion of large 
amounts of gas by quasar winds destabilizes the galaxy struc- 
ture in the inner, baryon dominated regions, and leads to a 
more expanded stellar distribution. An alternative explana- 
tion of the increase in galaxy size calls in minor mergers on 
parabolic orbits that mainly add stars in the outer parts of the 
galaxies from $z \sim 2$ down to the present epoch (e.g., Nipoti et
al. 2003; Hopkins et al. 2009b; Naab et al. 2009). 

On the other hand, the fit to the luminosity profile of high- 
z galaxies may miss the outer fainter regions biasing the size 
estimates (see Hopkins et al. 2010; Mancini et al. 2010);
however, a detailed analysis of a galaxy at z = 1.91 by Szomoru et
al. (2010) finds no evidence of faint outer envelopes.

In this paper we discuss critically the ideas proposed so far.
In § 2 we give arguments leading to expect that sizes of ETG 
progenitors are in fact as small as those observed; we also 
discuss the data on ETG sizes as function of redshift, pointing 
out the possibility that the size increase exhibits two distinct 
regimes. In § 3 we discuss the evolution of the velocity
dispersion. In § 4 we present the relevant details on the physical
mechanisms invoked to inflate ETGs by mass loss. In § 5
we discuss our results, while in § 6 we summarize our main 
conclusions. 

Throughout the paper we adopt the concordance cosmology 
(see Komatsu et al. 2009), i.e., a flat universe with matter
density parameter $\Omega_M = 0.3$ and Hubble constant $H_0 = 70$ km s$^{-1}$
Mpc$^{-1}$. Stellar masses in galaxies are evaluated by assuming
the Chabrier (2003) initial mass function (IMF).

2. COSMIC EVOLUTION IN SIZE OF ETGS 

In this section we present the recent observational evidence 
on the cosmic size evolution of ETGs, and then show that 
small sizes are indeed expected at high redshift if dissipative
collapse of the baryons occurred.

2.1. Observed size evolution of passively evolving galaxies 

A recent analysis by Maier et al. (2009) of a sample including
about 1100 galaxies with Sersic index n > 2.5, spectroscopic
redshifts in the range 0.5 < z < 0.9 and stellar masses
in the range $3 \times 10^{10}\,M_\odot < M_\star < 3 \times 10^{11}\,M_\odot$ shows that
the size evolution for galaxies at z ~ 0.7 is within a factor
$f_r(0.7) = R_e(0)/R_e(0.7) < 1.25$. For galaxies at 0.7 < z < 0.9
the size evolution is limited to a factor $f_r(0.9) < 1.4$. Small
size evolution ($f_r < 1.3$) for redshifts z < 0.8 was previously
reported by McIntosh et al. (2005) for a sample of 728 red
galaxies with Sersic index n > 2.5 and stellar masses in the
range $3 \times 10^{10}\,M_\odot < h^2 M_\star < 3 \times 10^{11}\,M_\odot$; in this case,
however, the majority of redshifts were photometric. At lower
redshift, z ~ 0.25, the Brightest Cluster Galaxies (BCGs)
exhibit slow evolution, $f_r < 1.3$ (Bernardi 2009).

A size evolution somewhat more pronounced (around 40%)
than found by Maier et al. (2009) has been claimed by
Trujillo et al. (2007) for massive galaxies ($M_\star > 10^{11}\,M_\odot$) at
redshift z ~ 0.65. However, restricting the analysis to galaxies
with spectroscopic redshifts in the range 0.5 < z < 0.8 (91
galaxies with n > 2.5 and mean stellar mass of $1.8 \times 10^{11}\,M_\odot$)
we find a mean effective radius of 4.94 kpc. The mean local
effective radius for galaxies with this stellar mass is around
6 kpc, implying an increase by a factor $f_r(0.65) \sim 1.2$. On
the other hand, the mean effective radius decreases to 3.8 kpc
for galaxies with the same mean mass but with average
redshift z ~ 0.9; in this case the size evolution amounts to a
factor $f_r(0.9) \approx 1.6$. Similar results are found by Ferreras et al.
(2009) for a sample of 195 red galaxies selected in the redshift
range 0.4 < z < 1.2. They are, on average, more compact than
local galaxies with Sersic index n > 2.5 by a factor of only
$f_r \approx 1.4$.

A stronger evolution of $R_e$ at fixed stellar mass was reported
by van der Wel et al. (2008) for a composite sample of 50
morphologically selected ETGs in the redshift range 0.8 <
z < 1.2. Since we are interested in the evolution at z < 1,
we have confined ourselves to the 20 galaxies in a massive
cluster at z ~ 0.83. For these we find, on average, $f_r(0.83) \approx 1.6$,
but with a substantial mass dependence: the most massive
galaxies (dynamical mass within $R_e$ of $M_{\rm dyn} \gtrsim 3 \times 10^{11}\,M_\odot$)
fall quite close to the local mass vs. $R_e$ relation, while the
lower mass galaxies tend to exhibit large size evolution.

All the results mentioned above are shown in Fig. 1, where
we also present a compilation of the data at redshift z > 1. 
We note that, while the data points at z < 1 are averages over 
large samples, at higher redshift data points refer to individual 
galaxies. 

Fig. 1. Evolution of the effective radius with redshift. The data points show: average sizes of z < 1 passively evolving galaxies, divided by the local sizes
of galaxies of equal stellar mass, in the samples by Trujillo et al. (2007), McIntosh et al. (2005), van der Wel et al. (2008), Maier et al. (2009) and Bernardi
(2009), with the associated errors; individual data and error bars for passively evolving galaxies with spectroscopic redshifts z > 1 by Longhetti et al. (2007),
Cimatti et al. (2008), Damjanov et al. (2009), Mancini et al. (2010), and van Dokkum et al. (2008); data and error bars for individual star forming galaxies
with spectroscopic redshifts z > 2 by Tacconi et al. (2008) and Law et al. (2009). The color of the data points refers to the stellar mass of the galaxy: red is for
$M_\star > 10^{11}\,M_\odot$, blue for $3 \times 10^{10}\,M_\odot < M_\star < 10^{11}\,M_\odot$, and green for $M_\star < 3 \times 10^{10}\,M_\odot$. The shaded area reflects the distribution of local SDSS galaxies (Hyde
& Bernardi 2009). Thick solid lines with arrows illustrate typical evolutionary tracks of massive galaxies according to our reference model (with $f_\sigma = 1.5$).

Assuming that the average evolution of $R_e$ can be described
by a power law of the form $(1+z)^{-\alpha}$, Buitrago et al.
(2008) find $\alpha = 1.48$, while van der Wel et al. (2008) obtain a
lower value, $\alpha = 1.20$. The latter authors also suggest a weaker
evolution, corresponding to $\alpha = 0.96$, for z < 1. However,
even this milder evolution is faster than indicated by the most
recent data summarized above, and especially by the most
extensive and spectroscopically complete study of Maier et al.
(2009).
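As a quick numerical check of this comparison, the quoted power-law slopes can be converted into the size ratio $f_r(z) = R_e(0)/R_e(z) = (1+z)^{\alpha}$ and set against the observational limits at z ~ 0.7. The sketch below assumes a pure power law over the whole range; the function name and the dictionary of fits are illustrative only.

```python
def size_growth_factor(z, alpha):
    """Return f_r(z) = R_e(0) / R_e(z), assuming R_e scales as (1 + z)**(-alpha)."""
    return (1.0 + z) ** alpha


# Slopes quoted in the text (labels are for printing only).
POWER_LAW_FITS = {
    "Buitrago et al. (2008)": 1.48,
    "van der Wel et al. (2008)": 1.20,
    "van der Wel et al. (2008), z < 1": 0.96,
}

if __name__ == "__main__":
    z = 0.7
    for label, alpha in POWER_LAW_FITS.items():
        print(f"{label}: f_r({z}) = {size_growth_factor(z, alpha):.2f}")
```

Even the mildest slope, $\alpha = 0.96$, gives $f_r(0.7)$ of about 1.66, above the limit $f_r(0.7) < 1.25$ quoted from Maier et al. (2009).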

A relevant feature of the data for massive galaxies at high
redshift is the quite large spread in size, as is apparent
from Figs. 1 and 2. Specifically, for masses larger than
$10^{11}\,M_\odot$ the scatter in size of high redshift ETGs amounts to
$\sigma_{\log R_e} \sim 0.41$, significantly wider than in local samples, for
which we have typically $\sigma_{\log R_e} \sim 0.14$ (cf. Shen et al. 2003;
Hyde & Bernardi 2009). In more detail, several high redshift
galaxies exhibit the same size as their local counterparts
(see e.g. Mancini et al. 2010; Onodera et al. 2010), while
about half of the ETGs exhibit $f_r \gtrsim 3-4$, with several of them
having $f_r > 8-10$. It is worth noticing that Maier et al. (2009)
find for the size distribution at fixed mass of their sample of
ETGs at redshift $\approx 0.7$ a statistical dispersion $\sigma_{\log R_e} \sim 0.16$,
very close to the local one.

Provided that the presently available data constitute a rep- 
resentative sample of the size of high redshift ETGs, both the 
average increase of the size and the narrowing of its
distribution are to be accounted for. Only large samples of high-z
ETGs will allow us to assess the interesting issue of their size 
distribution. We also note that the paucity of data at z > 1 
prevents the investigation of the possible mass dependence, a 
crucial aspect for any interpretation of the phenomenon. 

So far we have discussed the evolution by comparing high 
redshift size determinations with the average size of local 
ETGs. A bias may arise because high-z samples of passively 
evolving galaxies pick up objects that formed at higher red- 
shifts and therefore have smaller sizes. The majority of local 
ETGs probably formed at $1.5 < z_{\rm form} \lesssim 2.5$, but ETG
progenitors already in passive evolution at z ~ 2-2.5 formed about 1
Gyr earlier, i.e., at $z_{\rm form} \gtrsim 3.5$. The latter are expected to have,
on average, a factor $\approx 1.5-2$ smaller size than local ETGs
(see Eq. 5 below). This may explain why Valentinuzzi et al.
(2010) find that a substantial fraction (around 22%) of ETGs
in local galaxy clusters (overdense regions where the galaxies
typically formed earlier than in the field) are more compact 
than the local average. In fact, their cluster galaxies are on 
average 1.5 Gyr older than local ETGs with 'normal' size. It 
is worth mentioning that several massive blue galaxies have 
recently been found to exhibit compact sizes (Trujillo et al. 
2009). 

2.2. Sizes of high-redshift star forming galaxies 
Fig. 2. Correlation between effective radius and stellar mass. The observations of passively evolving galaxies with spectroscopic redshifts z > 1 by Longhetti
et al. (2007), Cimatti et al. (2008), Damjanov et al. (2009), Mancini et al. (2010), and van Dokkum et al. (2008) are compared with the local correlation (Hyde
& Bernardi 2009; the dotted line illustrates the average and the shaded area represents the variance). The color code refers to the stellar mass, as in the previous figure.
The dashed lines illustrate the outcomes of Eq. (5) for extreme values of the relevant parameters $f_\sigma$ and $z_{\rm form}$ (see text). Thick lines with arrows illustrate
typical evolutionary tracks of massive galaxies according to our reference model (with $f_\sigma = 1.5$ and $z_{\rm form} = 3$), featuring first the abrupt size growth due to quasar
feedback (almost vertical arrows), and then the possible slow size increase due to mass loss (arrows pointing left) or mass additions by minor mergers (arrows
pointing right).



Massive starforming galaxies at high-z are heavily obscured 
by dust and therefore their structure cannot be investigated 
by means of optical or near-IR observations; however, one 
can resort to interferometric observations at millimeter and 
submillimeter wavelengths. In particular CO molecular emis- 
sion has been spatially resolved for a sample of submillimeter 
bright galaxies at $z \approx 2$ by Bouché et al. (2007) and Tacconi
et al. (2008, 2010) with the IRAM Plateau de Bure millime- 
ter interferometer. The results obtained for the galaxies with 
spectroscopic redshift are shown in Fig. 1. For these objects
the dynamical mass is a good proxy for the mass in stars and 
gas. 

In Fig. 1 we also plotted the data of Law et al. (2009) on
a sample of Lyman break galaxies at redshifts 2 < z < 3 and 
with kinematics dominated by random motions at least in the 
central 2-3 kpc. In this case, $R_e$ refers to the H$\alpha$ or [OIII]
emissions, which are sensitive to dust extinction. The light 
distribution is expected to be irregular and knotty, as in fact 
it is observed. Since the dust distribution inside starforming 
galaxies follows the star and gas distributions, which peak in 
the central regions, we expect that the observed light profile 
is broadened with respect to the true star and gas distribution 
for rest-frame wavelengths shorter than a few microns (see 
Joung et al. 2009). Therefore the estimated half-light radii of 
H$\alpha$ or [OIII] emissions should be considered as upper limits.
Nevertheless, Fig. 1 suggests that large starburst galaxies and
high-z passively evolving galaxies, their close descendants, 
exhibit the same trend of smaller size with respect to the local 
ETGs. 



2.3. Expected sizes of high redshift galaxies 

Caon et al. (1993; see also Kormendy et al. 2009) showed
that the Sersic (1963) function:

$$ I(r) = I(0)\, e^{-b_n (r/R_e)^{1/n}} , \tag{1} $$

fits the brightness profiles of nearly all ellipticals with
remarkable precision over large dynamic ranges. Here $I(0)$ is the
central surface brightness, $R_e$ is the half-luminosity radius, and $n$
is the Sersic index. The constant $b_n$ can be determined from
the condition that the luminosity inside $R_e$ is half the total
luminosity, $L(R_e) = L_T/2$ (see Prugniel & Simien 1997). The
classical de Vaucouleurs (1953) profile corresponds to n = 4.

If light traces mass, the projected half stellar mass radius $R_e$
is related to the gravitational radius $R_g$ by $R_e = S_S(n) R_g$, and
the density-weighted, 3-dimensional velocity dispersion $\sigma_\star$
is related to the observed line-of-sight central velocity
dispersion $\sigma_0$ by $\sigma_\star = [3 S_K(n)]^{1/2} \sigma_0$. Note that $\sigma_0$ is usually
measured within a physical size of about $0.1\,R_e$ (e.g., Jørgensen et
al. 1993). At virial equilibrium, the mass is given by:

$$ M_\star + M_{\rm DM} = S_D(n)\, \frac{R_e\, \sigma_0^2}{G} , \tag{2} $$

where $S_D(n) = 3 S_K(n)/S_S(n)$. Prugniel & Simien (1997) have
tabulated the coefficients $S_D(n)$, $S_K(n)$, and $S_S(n)$ for values of the
Sersic index n ranging from 1 to 10; in particular, $S_S(4) \approx 0.34$,
$S_K(4) \approx 0.52$, and $S_D(4) = 4.591$.
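The virial estimator can be sketched numerically as follows, assuming the relation $M = S_D(n)\, R_e\, \sigma_0^2 / G$ with $S_D = 3 S_K/S_S$ as defined above. The value of G in galactic units and the example galaxy (200 km/s, 5 kpc) are assumptions made here for illustration and are not taken from the text.

```python
# Gravitational constant in kpc (km/s)^2 / Msun (assumed value, about 4.301e-6).
G_KPC_KMS2_PER_MSUN = 4.301e-6


def structure_factor_sd(s_k, s_s):
    """S_D = 3 S_K / S_S, built from the tabulated Sersic-dependent coefficients."""
    return 3.0 * s_k / s_s


def virial_mass_msun(sigma0_kms, r_e_kpc, s_d):
    """Virial mass M = S_D * R_e * sigma_0^2 / G, in solar masses."""
    return s_d * r_e_kpc * sigma0_kms ** 2 / G_KPC_KMS2_PER_MSUN


if __name__ == "__main__":
    # n = 4 coefficients quoted in the text: S_S(4) ~ 0.34, S_K(4) ~ 0.52.
    s_d4 = structure_factor_sd(s_k=0.52, s_s=0.34)
    print(f"S_D(4) from rounded coefficients: {s_d4:.3f}")
    # Hypothetical galaxy, for illustration only.
    mass = virial_mass_msun(sigma0_kms=200.0, r_e_kpc=5.0, s_d=4.591)
    print(f"M_vir = {mass:.2e} Msun")
```

With the rounded coefficients the ratio comes out close to, but not exactly at, the tabulated $S_D(4) = 4.591$, which is expected since $S_S(4)$ and $S_K(4)$ are quoted to two digits only.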

The stellar component gravitationally dominates in the in- 
ner regions of galaxies, while the DM with its extended halo 
dominates in the outer regions. To a halo with mass $M_H$
we can associate an initial baryon mass $M_{b,i} = f_b M_H$, where
$f_b \approx 0.2$ is the cosmic baryon to DM mass ratio. Weak lensing
observations (Mandelbaum et al. 2006), extended X-ray
emission around ETGs (e.g., O'Sullivan & Ponman 2004) and the
comparison of the statistics of the halo mass function with the
galaxy luminosity function (Vale & Ostriker 2004; Shankar et
al. 2006; Guo & White 2009; Moster et al. 2010) point to a
present-day ratio $m = M_H/M_\star \approx 20-40$ between the total halo
mass and the total mass in stars for red massive galaxies. This
result quantifies the inefficiency of the star formation process
even in large galaxies, since $M_\star/M_{b,i} = 1/(m f_b)$. The
dependence of $m$ on mass and redshift predicted by our reference
model (Granato et al. 2004) is discussed in the Appendix.

As for the inner regions, we consider the ETG progenitor
1255-0 at $z \approx 2.2$, for which van Dokkum et al. (2009) and
Kriek et al. (2009) find $M_\star \approx 2 \times 10^{11}\,M_\odot$ within the
half-light radius $R_e \approx 0.8$ kpc; local ETGs with the same mass
have an average size $R_e \approx 7$ kpc (see Shen et al. 2003; Hyde
& Bernardi 2009). If $m \approx 30$, we can associate to 1255-0 a
halo mass $M_H \approx 6 \times 10^{12}\,M_\odot$. Assuming an NFW profile with
concentration c = 4, it is easy to see that inside the
gravitational radius $R_g \approx 3 R_e \approx 3$ kpc, the DM fraction amounts to
$f_{\rm DM} \sim 0.1$. We notice that the DM contribution to the mass
within the half-light radius $R_e$ of local ETGs is $f_{\rm DM} \lesssim 30\%$
(e.g., Borriello et al. 2003; Tortora et al. 2009; Cappellari et
al. 2006), and has a small effect on the stellar velocity
dispersion. This supports the notion that the dissipationless DM
cannot parallel the dissipative collapse of baryons, so that its
gravitational effects within $R_e$ can be neglected also during
the compact phase of galaxy evolution.

After the dissipative collapse of baryons inside a host DM
halo, the gravitational radius of the baryonic component, stars
plus gas with mass $M_\star$ and $M_{\rm gas}$ respectively, reads

$$ R_g = \frac{G M_\star (1 + f_{\rm gas})}{\sigma_\star^2} , \tag{3} $$

where $f_{\rm gas} = M_{\rm gas}/M_\star$ is the gas to star mass ratio within the
gravitational radius. We note that the central gas mass
includes the cold component as estimated in the Appendix (see
Eq. A4 and below for analytic approximations).

If before collapse the baryons had the same velocity
dispersion $\sigma_{b,i}$ as the DM, $\sigma_{\rm DM}$, and taking into account that $\sigma_{\rm DM}$
is approximately equal to the halo rotational velocity $V_H$ (see
Appendix), the 3-D stellar velocity dispersion $\sigma_\star$ at the end of
the collapse can be written as (Fan et al. 2008):

$$ \sigma_\star = f_\sigma\, \sigma_{b,i} = f_\sigma\, \sigma_{\rm DM} \approx f_\sigma\, V_H . \tag{4} $$


Recalling that $R_e = S_S(n) R_g$, with $R_g$ given by Eq. (3), and that
$V_H^2 = G M_H/R_H$, we then obtain $R_e$ in terms of the halo radius
$R_H$ and mass $M_H$:

$$ R_e = \frac{S_S(n)}{f_\sigma^2}\,(1+f_{\rm gas})\,\frac{M_\star}{M_H}\,R_H \approx 0.9\, \frac{S_S(n)}{0.34}\, \frac{25}{m} \left(\frac{1.5}{f_\sigma}\right)^2 \left(\frac{M_H}{10^{12}\,M_\odot}\right)^{1/3} \frac{4}{1+z_{\rm form}}\ {\rm kpc} , \tag{5} $$

where $z_{\rm form}$ is the redshift when the collapse begins, and we
have set $f_{\rm gas} = 1$. This equation shows that the baryon collapse
naturally leads to kpc or sub-kpc effective radii and to
stellar velocity dispersions higher than halo rotational velocities
($f_\sigma > 1$). Both these properties differ from those observed for
local massive galaxies, implying that other ingredients have
come into play.
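A minimal sketch of the scaling in Eq. (5) is given below. It assumes the normalization reads 0.9 kpc at the reference values $S_S = 0.34$, $m = 25$, $f_\sigma = 1.5$, $M_H = 10^{12}\,M_\odot$ and $z_{\rm form} = 3$, with $f_{\rm gas} = 1$ absorbed into the prefactor; the function and argument names are illustrative, and $m$ is treated here as a free input rather than as a function of redshift.

```python
def effective_radius_kpc(m_halo_msun, z_form, m_ratio=25.0, f_sigma=1.5, s_s=0.34):
    """Effective radius after baryon collapse, following the scaling form of Eq. (5).

    m_ratio is m = M_H / M_star; f_sigma is sigma_star / V_H; s_s is S_S(n).
    The 0.9 kpc normalization assumes f_gas = 1.
    """
    return (
        0.9
        * (s_s / 0.34)
        * (25.0 / m_ratio)
        * (1.5 / f_sigma) ** 2
        * (m_halo_msun / 1.0e12) ** (1.0 / 3.0)
        * 4.0 / (1.0 + z_form)
    )


if __name__ == "__main__":
    # Reference model: f_sigma = 1.5, z_form = 3.
    print(f"reference: {effective_radius_kpc(1.0e12, 3.0):.2f} kpc")
    # No increase of the velocity dispersion (f_sigma = 1) and late formation (z_form = 1).
    print(f"f_sigma=1, z_form=1: {effective_radius_kpc(1.0e12, 1.0, f_sigma=1.0):.2f} kpc")
```

At fixed $m$, lowering $f_\sigma$ from 1.5 to 1 and $z_{\rm form}$ from 3 to 1 raises the predicted radius by a factor 4.5, which illustrates how strongly the result depends on these two parameters.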

The explicit redshift dependence of $R_e$ in Eq. (5) comes
from the halo radius, which scales as $(1+z_{\rm form})^{-1}$. In
addition, the ratio $m$, which measures the star formation
inefficiency and is determined by the physics of baryons, scales as
$(1+z_{\rm form})^{-0.25}$ (see Appendix). As a result the effective
radius scales like $R_e \propto (1+z_{\rm form})^{-0.75}$. The values of $m$ and of
$f_\sigma$ depend on how and when the star formation and gas
heating processes can halt the collapse. The latter must proceed
at least until the mass inside $R_e$ is dominated by stars, as
observed in local ETGs.

In Fig. 2 we compare the observed distribution of local and
high-z galaxies in the $R_e$ vs. $M_\star$ plane with expectations from
Eq. (5). A robust upper limit to the high-z correlation (upper
dashed line in Fig. 2) is obtained setting $f_\sigma = 1$ (baryon
collapse with no increase of the stellar velocity dispersion) and
$z_{\rm form} = 1$, corresponding to a look-back time of about 7-8
Gyr, a lower limit to the mass-weighted age of local massive 
ETGs (Gallazzi et al. 2006; Valentinuzzi et al. 2010). The 
corresponding line falls just at the lower boundary of the dis- 
tribution of local ETGs, but at the upper boundary of the dis- 
tribution of high-z passively evolving massive galaxies. The 
median size of the latter is a factor around 4 lower than that of 
local galaxies with the same stellar mass. This argument is not 
in contrast with the existence, recently reported by Valentin- 
uzzi et al. (2010), of local compact ETGs, which can repre- 
sent the evolution of the oldest, most compact progenitors. 

The lower bound to the high-z correlation is less well
defined; in Fig. 2 the lower dashed line corresponds to $z_{\rm form} \approx 5$
and to $f_\sigma = 2$. In Eq. (5) we have adopted as our reference
values $f_\sigma = 1.5$ and $z_{\rm form} = 3$.

3. EVOLUTION OF THE ETG VELOCITY DISPERSIONS 

Applying the virial theorem to the galaxies before and after
their growth in size, we have that the final line-of-sight central
stellar velocity dispersion $\sigma_{0,f}$ is related to the initial one $\sigma_{0,i}$
by:

$$ \sigma_{0,f}^2 = \sigma_{0,i}^2\, \frac{S_D(n_i)}{S_D(n_f)}\, \frac{M_f}{M_i}\, \frac{R_{e,i}}{R_{e,f}} = \frac{f_\sigma^2\, V_H^2(z_{\rm form})}{3 S_K(n_i)}\, \frac{S_D(n_i)}{S_D(n_f)}\, \frac{M_f}{M_i}\, \frac{R_{e,i}}{R_{e,f}} ; \tag{6} $$

here the indices $i$ and $f$ label quantities in the initial and
final configuration, and $S_D(n)$ is the structure factor defined by
Prugniel & Simien (1997; see Eq. 2), $n$ being the Sersic
index. In the last expression we have used Eq. (4) and the
relation $\sigma_\star^2 = 3 S_K(n_i)\, \sigma_{0,i}^2$. In the case of a homologous growth of
the galaxy size, the velocity dispersion scales as $(M/r)^{1/2}$, so
that it remains constant if both the mass and the size increase
by the same factor and decreases as $r^{-1/2}$ if the growth occurs
at constant mass. However, the size growth is not
necessarily homologous. All the mechanisms so far proposed predict
an increase of the Sersic index with increasing size. This
effect, together with a possible increase in mass within the limits
imposed by the mass function evolution, tends to soften the
decrease of the velocity dispersion. A further attenuation of the
evolution is expected because of dynamical friction with the
DM component.
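The scaling between initial and final velocity dispersion can be illustrated with a short sketch. It assumes the virial relation $\sigma_0^2 \propto M / [S_D(n)\, R_e]$, so that the ratio of final to initial dispersion depends only on the growth factors in size and mass and on the ratio of structure factors; the function and argument names are illustrative.

```python
import math


def sigma_ratio(size_growth, mass_growth=1.0, sd_ratio=1.0):
    """Return sigma_0f / sigma_0i from the virial scaling.

    size_growth = R_e,f / R_e,i
    mass_growth = M_f / M_i
    sd_ratio    = S_D(n_i) / S_D(n_f); equal to 1 for homologous growth
    """
    return math.sqrt(sd_ratio * mass_growth / size_growth)


if __name__ == "__main__":
    # Homologous growth with mass and size increasing by the same factor.
    print(sigma_ratio(size_growth=3.0, mass_growth=3.0))
    # Homologous growth at constant mass: sigma falls as r**-0.5.
    print(sigma_ratio(size_growth=4.0))
```

Setting `sd_ratio` above 1 mimics an increase of the Sersic index during the expansion, which softens the drop in velocity dispersion for a given size growth.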

The above equation shows that the size evolution of ETGs 
is paralleled by velocity dispersion evolution and that the 
present-day velocity dispersion keeps track of the potential 
well of the host halo when the galaxy forms. This is expected, 
since in a galaxy halo the gas is channeled toward the central 
regions during the fast accretion phase under the effect of the 
DM potential well. The duration of the star formation pro- 
cess depends on halo mass, feedbacks and redshift at which 
the fast accretion phase occurs. The velocity dispersion of the 
collapsed galaxy is not affected by the minor fraction of DM 
added subsequently to the external regions of a halo during 
the slow accretion phase (see Lapi & Cavaliere 2009). 

Observations of the kinematics of passively evolving ETGs 
at z > 1 are quite difficult. Nonetheless for two galaxies re- 
liable estimates of the velocity dispersion have been obtained 
(Cappellari et al. 2009; van Dokkum et al. 2009). For 
other objects only average estimates have been inferred from 
stacked spectra (Cenarro & Trujillo 2009; Cappellari et al. 
2009). In all cases the main conclusion is that stellar masses 
derived from spectrophotometry are in good agreement with 
virial masses or with masses derived from dynamical models, 
if one adopts an IMF flattening below $1\,M_\odot$, such as those
proposed by Kroupa (2001) or by Chabrier (2003). This find- 
ing is also confirmed at intermediate redshifts 0.4 < z < 0.9 
by van der Wel et al. (2008).

One of the two high redshift ETGs with a good determi- 
nation of the velocity dispersion, GMASS 2470 (Cappellari 
et al. 2009), falls in the $\sigma$ vs. $M_\star$ plane quite close to the
area covered by local ETGs. On the other hand, the best fit
value of the stellar velocity dispersion, $\sigma_0 = 510^{+165}_{-95}$ km s$^{-1}$, for the
galaxy 1255-0 at $z \approx 2.2$ (van Dokkum et al. 2009) exceeds
the measured values for even the most massive local galaxies 
(Bernardi et al. 2008). Although we cannot do statistics with 
a single case, its existence lends support to the possibility of 
a significant evolution of the galaxy velocity dispersion. 

Cappellari et al. (2009; see also Bernardi 2009), based on
stacked spectra of 13 galaxies at 1.4 < z < 2.0 (cf. their
Table 1), find a mild evolution of the velocity dispersion, which
decreases from about 202 km s$^{-1}$ at z ~ 1.6 down to about 160
km s$^{-1}$ at $z \approx 0$ for $M_\star \approx 7 \times 10^{10}\,M_\odot$, and find an increase
of the source size by a factor around 3.5. This evolution can
be understood if the Sersic index $n$ increases from an
initial value $n_i$ to $n_f = n_i + 2$ and the mass increases by 30%; in
that case the velocity dispersion decreases by a factor 1.35, or
less if dynamical friction with DM has a role. Note that if, as
shown in § 2.3, the DM is dynamically irrelevant in the inner
regions of galaxies, and mergers accrete matter in the outer
regions, the mass within $R_e$ does not change and the same
velocity dispersion evolution applies also to the minor merger
scenario.

A quite interesting upper limit to the velocity dispersion,
$\sigma_\star < 326$ km s$^{-1}$, for a massive ($M_\star \approx 3-4 \times 10^{11}\,M_\odot$) galaxy at
redshift z ~ 1.82 has been found by Onodera et al. (2010). The
same authors also find that the size of this galaxy is as
expected for a local galaxy with the same mass. The velocity
dispersion and the size yield a virial mass upper limit
$M_{\rm vir} < 7 \times 10^{11}\,M_\odot$, quite close to the stellar mass. This galaxy
has the same structural properties as a local ETG. Recalling
that a significant fraction of z > 1.5 galaxies already exhibit
a size close to the size of their local counterparts, this galaxy
appears as a well studied case of an already evolved galaxy,
suggesting that the timescale for the size evolution is shorter
than the Hubble time at those redshifts, $\Delta T_{\rm size} < t_H(z)$.

4. PHYSICAL MECHANISMS FOR SIZE EVOLUTION 

Both theory and observations suggest that at least 60% of 
ETGs evolve in size by at least a factor of 2-4. So far, two 
main mechanisms have been proposed to accomplish such 
evolution. One possibility is that the expansion is driven by 
the expulsion of a substantial fraction of the initial baryons, 
still in gaseous form, by quasar activity (Fan et al. 2008) or 
by an expulsion of gas associated with stellar evolution (e.g.,
Damjanov et al. 2009). The two mechanisms differ in the expulsion
timescale, which is shorter than the dynamical time if
it is triggered by quasar activity and longer in the case of ejection
associated with stellar evolution (with 'standard' IMFs).

Alternatively, the increase in size could be due to minor 
mergers on parabolic orbits that add stars in the outer parts of 
the galaxies over cosmic time from $z \sim 1$-$2$ to the present
epoch (see Maller et al. 2006; Naab et al. 2009; Hopkins et
al. 2009b; van der Wel et al. 2009). Major mergers (i.e.,
mergers of galaxies with similar mass) can also increase the 
galaxy size in a way almost directly proportional to the mass 
increase and they were also considered (e.g., Boylan-Kolchin 
et al. 2006; Naab et al. 2007) but the required space den- 
sities of progenitors were found to be incompatible with the 
present-day galaxy mass function (Bezanson et al. 2009; Toft 
et al. 2009) as well as with the dearth of compact, massive 
galaxies in the local universe (Trujillo et al. 2009). 

A third possibility is that the increase is illusory, because 
the low-surface brightness in the outer regions of high-z 
galaxies may be missed and the effective radii are correspond- 
ingly underestimated (Mancini et al. 2010; Hopkins et al. 
2010) or because a gradient in the M/L ratio (lower in the 
bluer central regions) can make the half-light radius in the op- 
tical smaller than the half-mass radius (Tacconi et al. 2008); 
however, Szomoru et al. (2010) find no evidence of outer faint 
envelopes in a well-studied galaxy at $z \sim 1.9$.

4.1. Gas expulsion

In the case of gas expulsion the final size depends on the
timescale of the ejection itself. If the ejection occurs on a
timescale shorter than the dynamical timescale of the system,
$\tau_{\rm ej} < \tau_{\rm dyn}$, immediately after the ejection the size and velocity
dispersion are unchanged but the total energy is larger because
the mass has decreased. The system then expands and evolves
towards a new equilibrium configuration. In the case of homologous
expansion the final size $R_f$ is related to the initial
one $R_i$ by (Biermann & Shapiro 1979; Hills 1980):

$$\frac{R_i}{R_f} = 1 - \frac{M_{\rm ej}}{M_f} \tag{7}$$

where $M_{\rm ej}$ is the ejected mass and $M_f$ is the final mass.

This simple result has been confirmed by numerical simu- 
lations of star clusters (e.g., Geyer & Burkert 2001; Boily & 
Kroupa 2003). In particular, the simulations by Goodwin & 
Bastian (2006) and by Baumgardt & Kroupa (2007) show that
the expansion of the half-mass radius occurs in about 20 dynamical
times and the new final equilibrium is attained within
40 dynamical times. We note that the case of galaxies differs
from that of the star clusters owing to the presence of the DM
halo. In ETGs the DM halo exerts its gravitational influence
outside the central region dominated by stars and prevents the
galaxy disruption when $M_{\rm ej}$ approaches or exceeds $M_f$; the
DM potential can also influence the time taken by the stars to 
reach the new equilibrium. 

When the mass loss occurs on a timescale longer than the
dynamical time the system expands through the adiabatic invariants
of the stellar orbits and one gets

$$\frac{R_f}{R_i} = 1 + \frac{M_{\rm ej}}{M_f} \tag{8}$$

Comparison of the two equations above shows that the fast expulsion
is more effective in increasing the size.
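
As a rough numerical illustration of the two regimes, the sketch below compares the expansion factor for impulsive and for adiabatic mass loss. It assumes the standard impulsive, homologous result $R_i/R_f = 1 - M_{\rm ej}/M_f$ (Hills 1980) together with Eq. (8), and it ignores the DM halo; the function names and the sample mass ratios are illustrative choices, not values taken from the text.

```python
def fast_expansion(mej_over_mf):
    """Size ratio R_f/R_i for mass loss faster than the dynamical time.

    Assumes the impulsive, homologous result R_i/R_f = 1 - M_ej/M_f.
    Without a DM halo the system is unbound once M_ej >= M_f, so the
    function returns infinity in that case.
    """
    if mej_over_mf >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - mej_over_mf)


def slow_expansion(mej_over_mf):
    """Size ratio R_f/R_i for adiabatic (slow) mass loss, Eq. (8)."""
    return 1.0 + mej_over_mf


if __name__ == "__main__":
    # Illustrative ejected-to-final mass ratios (hypothetical values).
    for x in (0.25, 0.5, 0.75):
        print(f"M_ej/M_f = {x:.2f}: fast = {fast_expansion(x):.2f}, "
              f"slow = {slow_expansion(x):.2f}")
```

For any ejected fraction between 0 and 1 the impulsive case gives the larger expansion, which is the point made by the comparison of the two equations.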
The dynamical time of the stellar component is 



$$\tau_{\rm dyn} \approx 3 \times 10^{6} \left(\frac{R_e}{1\,{\rm kpc}}\right)^{3/2} \left(\frac{M_\star}{10^{11}\,M_\odot}\right)^{-1/2}\ {\rm yr}, \tag{9}$$

about 30-50 times shorter than the typical dynamical
timescale in local massive ETGs and not much longer than
the dynamical timescale usually associated with star clusters. In
the case of mass loss due to stellar feedback (Hills 1980; Richstone
& Potter 1982) $\tau_{\rm ej} \gg \tau_{\rm dyn}$ for any reasonable choice of
the IMF. For instance, if a Chabrier (2003) or Kroupa (2001)
IMF is adopted, after an initial burst about half of the mass
of formed stars returns to the gaseous phase over a timescale
of 1 Gyr. If this gas is removed from galaxies, the size may
grow by a factor of about 2. The higher the proportion of massive
stars, the larger the effect on the size, and the shorter the
timescale for the size expansion (see Damjanov et al. 2009).

In the case of quasar winds the typical timescale for gas 
ejection can be estimated as 



$$\tau_{\rm ej} = \frac{M_{\rm gas}}{\dot M_{\rm wind}} \tag{10}$$



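where $\dot M_{\rm wind}$ is given by Eq. (A12), and we have assumed
$M_{\rm gas} \sim M_\star$. An alternative definition of $\tau_{\rm ej}$ is

$$\tau_{\rm ej} = \frac{R_e}{V_{\rm esc}} \approx 10^{6} \left(\frac{R_e}{1\,{\rm kpc}}\right)^{3/2} \left(\frac{M_\star}{10^{11}\,M_\odot}\right)^{-1/2}\ {\rm yr}, \tag{11}$$

where $V_{\rm esc}^2 = 2GM_\star/R_e$ defines the escape velocity from the radius
$R_e$. With both definitions the ejection timescale is of the order
of the dynamical timescale.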

It is apparent that numerical simulations are badly needed
to investigate the detailed effect of quasar winds on the size
and on the timescale $\Delta T_{\rm size}$ to reach the new equilibrium;
simulations of this kind are underway (L. Ciotti and F. Shankar
2009, private communication).

4.2. Minor mergers 

In the case of minor mergers on parabolic orbits the initial
potential energy of the accreting mass is neglected in the
computation. Following Naab et al. (2009) we assume that
random motions are dominant in high-z ETG precursors and
set $\eta = M_a/M_i$ and $\epsilon = \sigma_a^2/\sigma_i^2$, the $i$ and $a$ indices referring
to initial and accreted material. The mass after merging is
therefore $M_f = M_i(1 + \eta)$. If $r \propto M^{\alpha}$, the virial theorem gives
$\epsilon = \eta^{1-\alpha}$. Local ETGs have $\alpha \approx 0.56$ (Shen et al. 2003) or
even larger in the case of BCGs (Hyde & Bernardi 2009);
in addition, a value $\alpha \approx 0.5$ is implied by the Faber-Jackson
(1976) relationship.

From the virial theorem and the energy conservation equation
it is easily found that the fractional variations of the gravitational
radius and of the velocity dispersion between the configurations
before ($i$) and after ($f$) merging are:

$$\frac{R_{g,f}}{R_{g,i}} = \frac{(1+\eta)^2}{1+\eta^{2-\alpha}}, \qquad \frac{\sigma_f^2}{\sigma_i^2} = \frac{1+\eta^{2-\alpha}}{1+\eta}. \tag{12}$$



Boylan-Kolchin et al. (2008) showed that minor mergers can 
be effective only if $\eta > 0.1$, lower mass ratios requiring too
long timescales.
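
The merger relations can be explored numerically with a minimal sketch such as the one below. It evaluates the size and velocity dispersion changes for a single dry merger of mass ratio $\eta$, using $\epsilon = \eta^{1-\alpha}$, and then chains identical mergers. Treating every merger as having the same $\eta$ relative to the current mass, and keeping $\alpha = 0.56$ fixed, are simplifying assumptions made here for illustration; the function names are hypothetical.

```python
def merger_step(eta, alpha=0.56):
    """Return (R_f/R_i, sigma_f^2/sigma_i^2) for one dry merger.

    eta   : accreted-to-initial mass ratio M_a/M_i
    alpha : slope of the size-mass relation r ~ M**alpha
    """
    eta_eps = eta ** (2.0 - alpha)  # eta * epsilon, with epsilon = eta**(1 - alpha)
    size_ratio = (1.0 + eta) ** 2 / (1.0 + eta_eps)
    sigma2_ratio = (1.0 + eta_eps) / (1.0 + eta)
    return size_ratio, sigma2_ratio


def repeated_mergers(n_mergers, eta=0.1, alpha=0.56):
    """Cumulative (mass, size, sigma) growth factors after n identical mergers."""
    mass = size = sigma2 = 1.0
    for _ in range(n_mergers):
        r, s2 = merger_step(eta, alpha)
        mass *= 1.0 + eta
        size *= r
        sigma2 *= s2
    return mass, size, sigma2 ** 0.5


if __name__ == "__main__":
    mass, size, sigma = repeated_mergers(7)
    print(f"mass x{mass:.2f}, size x{size:.2f}, sigma x{sigma:.2f}")
```

Under these assumptions, seven mergers with $\eta = 0.1$ nearly double the mass while increasing the size by roughly a factor of 3 and lowering the velocity dispersion by about 20%.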

Recent numerical simulations by Naab et al. (2009) agree
with these results. Their simulated galaxy, with a mass in stars
$\approx 8 \times 10^{10}\,M_\odot$ and half-mass radius $\approx 1$ kpc at $z \approx 2$,
by $z = 0$ has doubled its stellar mass through minor mergers,
reaching $\approx 1.5 \times 10^{11}\,M_\odot$, while the half-mass radius has
increased by a factor 2.7. The simulations also suggest that
most of the increase, a factor of about 1.8, occurs at $z < 1$,
i.e., on a cosmological timescale. This is accompanied by a
moderate decrease, < 20%, of the central velocity dispersion
between $z \sim 3$ and $z \sim 0$ and by a decrease of the central density
of the stellar distribution with time, due to dynamical friction,
despite the total mass increase. However, these simulations
yield a present-day half-mass radius a factor of 2 smaller than
expected on the basis of the $M_\star$ vs. $R_e$ relationship of Shen et
al. (2003; see also Fig. 2).

We notice that the size evolution in the merging case occurs
on a timescale which is comparable with the present Hubble
time, with size scaling $\propto (1+z)^{-\beta}$; in the simulations of Naab
et al. (2009) $\beta \approx 1$ holds, and similarly in the findings of van
Dokkum et al. (2010) $\beta \approx 1.27$.
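
To give a feeling for what these slopes imply, the short sketch below converts a power-law scaling into a size growth factor between two redshifts. It assumes the size scales as $(1+z)^{-\beta}$, i.e. that sizes grow towards low redshift; the function name and the choice of starting redshift are illustrative.

```python
def size_growth(z_start, beta, z_end=0.0):
    """Growth factor R(z_end)/R(z_start) for R proportional to (1 + z)**(-beta)."""
    return ((1.0 + z_start) / (1.0 + z_end)) ** beta


if __name__ == "__main__":
    # Illustrative starting redshift z = 2 for the two quoted slopes.
    for beta in (1.0, 1.27):
        print(f"beta = {beta}: growth since z = 2 is x{size_growth(2.0, beta):.2f}")
```

With these slopes the growth from $z = 2$ to the present is a factor of 3 to about 4, accumulated gradually over the whole interval.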

5. DISCUSSION 

The observational data and the theoretical arguments sum- 
marized in the previous sections allow us to test and constrain 
the different models for size evolution. Since a size increase 
by minor dry mergers implies an increase in mass, we start by 
discussing limits on the latter.

5.1. The mass evolution of ETGs 

Spectral properties of local ETGs with stellar masses
$> 3 \times 10^{10}\,M_\odot$ indicate that their light-weighted age exceeds
8-9 Gyr, independently of the environment (see Renzini 2006
for a review and Gallazzi et al. 2006 for an extensive statistical
study). Since light-weighted ages are lower limits to
mass-weighted ages (e.g., Valentinuzzi et al. 2010), it is generally
agreed that most of the stars of massive ETGs formed at
$z_{\rm form} \gtrsim 1.5$-$2$. An upper limit < 25% to the fraction of stars
formed in ETGs in the last $\approx 8$ Gyr, and as a consequence
to the fraction of gas accreted at intermediate redshift $z < 1$,
has been derived from studies of narrow band indices of local
field ETGs (e.g., Annibali et al. 2007). Moreover, all massive
galaxies that formed and gathered the bulk of their stars
at $z > 1$ are presently ETGs or massive bulges of Sa galaxies,
since there are no late-type, disc-dominated galaxies endowed
with such large masses of old stellar populations.

Thus by comparing the stellar mass function of local ETGs 
to the mass function of all galaxies at z < 1.5, we can derive 
information on the mass evolution. 

Bernardi et al. (2010) studied in detail about 2000
morphologically-classified local galaxies extracted from the
SDSS sample (see Fukugita et al. 2007). They showed that
the concentration index $C_r$ can be used to discriminate among
galaxy types. The criterion $C_r > 2.86$ includes almost all
ellipticals, about 80% of S0 galaxies and 40% of Sa galaxies,
likely those with a larger disk component of younger and
bluer stars. Correspondingly, while the fraction of E and S0
massive galaxies ($M_{\rm dyn} \gtrsim 10^{11}\,M_\odot$) older than 8 Gyr is quite
large, the fraction of massive and old Sa galaxies is less than
50% (cf. Fig. 23 of Bernardi et al. 2010).

Therefore, in order to consistently compare the high red- 
shift mass function with the local one, in Fig. 3 we report the 
cumulative mass function for the full Bernardi et al. (2010) 
sample with concentration index Cr > 2.86. We also plot the 
mass function by Cole et al. (2001; similar results were ob- 
tained by Bell et al. 2003), that has been computed with 
criteria that tend to exclude late type galaxies. These local 
mass functions are compared with estimates by Pozzetti et al. 
(2007) and Marchesini et al. (2009), which refer to redshift 
$z \approx 1.4$ and $z \approx 1.6$, respectively. All mass functions have
been rescaled to Chabrier's (2003) IMF.

When comparing the local to the high redshift ETG mass
function, the first issue is how to make a complete census of
high redshift ETG progenitors. Williams et al. (2009) found
that in a deep sample (magnitudes $K_{\rm AB} < 22.4$) the most luminous
objects at $z \sim 1$-$2$ are divided roughly equally between
starforming and quiescent galaxies. A significant fraction of
galaxies at $z > 1.5$, forming stars at rates of hundreds to thousands
of solar masses per year as revealed by far-IR or submillimeter
surveys, are easily missed even by deep K-band surveys
because of their strong dust obscuration (e.g., Dye et al.
2008). An extreme example is GN10, a galaxy at $z \sim 4$ that
exhibits a star formation rate around $1000\,M_\odot$ yr$^{-1}$, a stellar
mass around $10^{11}\,M_\odot$ and a dust extinction $A_V \sim 5$-$7.5$
mag (see Daddi et al. 2009; Wang et al. 2009); this went
undetected by ultradeep $K_s$-band exposures, yielding a $1\sigma$
upper limit of 23 nJy (Wang et al. 2009) that corresponds to
$K_{\rm AB} \approx 28$. Since all galaxies at redshift $z > 1.5$ have to be included
in the budget, mass functions of high redshift massive
galaxies based on optical or near-IR selected samples should
be regarded as lower limits to the high redshift counterparts
of local ETGs (see Silva et al. 2005 for a more detailed discussion).
As a consequence, only upper limits to the evolution in
mass allowed by the data can be obtained, by assuming that the
galaxy number density stays constant.
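
The correspondence between the quoted flux limit and a magnitude of about 28 can be checked with the standard AB magnitude definition. The sketch below is only a consistency check; it assumes the usual AB zero point of 3631 Jy, and the function name is hypothetical.

```python
import math

AB_ZERO_POINT_JY = 3631.0  # flux density of a zero-magnitude source in the AB system


def ab_magnitude(flux_njy):
    """AB magnitude for a flux density given in nanojansky."""
    flux_jy = flux_njy * 1e-9
    return -2.5 * math.log10(flux_jy / AB_ZERO_POINT_JY)


if __name__ == "__main__":
    print(f"23 nJy corresponds to AB magnitude {ab_magnitude(23.0):.2f}")
```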

In addition, it is apparent from Fig. 1 of van Dokkum et al. 
(2010) that the upper limit to mass evolution slightly depends 
on mass or correspondingly on the reference number density.
van Dokkum et al. (2010) adopt as a reference number density
$2 \times 10^{-4}$ Mpc$^{-3}$ dex$^{-1}$ and find that galaxies with this number
density at $z \approx 1.6$ are endowed with $M_\star \approx 1.7 \times 10^{11}\,M_\odot$,
while at $z \sim 0.1$ the same number density pertains to galaxies
with $M_\star \approx 2.8 \times 10^{11}\,M_\odot$ (the adopted local number density
is that of Cole et al. 2001); the ensuing upper limit to mass
evolution is < 70%. Applying the same argument to galaxies
with number density 2 x 10"^ Mpc"^ dex"' yields an upper 



limit to mass evolution < 40%. If the local number density of
Bernardi et al. (2010) is adopted, the upper limit to mass evolution
is < 50% since $z \sim 1.6$ with practically no dependence
on mass, as shown in Fig. 3.
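
The fixed-number-density argument reduces to a simple ratio of the masses read off the two mass functions at the same number density. The sketch below reproduces that arithmetic for the pair of masses quoted above (1.7 and 2.8 in the same units); the function name is hypothetical and the calculation assumes, as in the text, that the number density is conserved.

```python
def mass_growth_limit(m_high_z, m_local):
    """Upper limit to the fractional mass increase at fixed number density."""
    return m_local / m_high_z - 1.0


if __name__ == "__main__":
    limit = mass_growth_limit(1.7, 2.8)
    print(f"allowed mass growth: {100.0 * limit:.0f}%")
```

The result, about 65%, is the origin of the quoted limit of less than 70%.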

These upper limits are compatible with evidence that most,
if not all, massive ETGs are already in place at redshift $z \sim 1$
(see Drory et al. 2005; Perez-Gonzalez et al. 2008; Fontana
et al. 2006; Cirasuolo et al. 2010; Kajisawa et al. 2009)
and that only a fraction < 30% of their stellar mass can be
added at later times. Collins et al. (2009) have estimated the
masses of the Brightest Cluster Galaxies (BCGs) in 5 of the
most distant X-ray-emitting galaxy clusters at redshifts
$z \approx 1.2$-$1.5$, finding that they are perfectly compatible with the
local average mass of BCGs. If the two galaxies that have
companions incorporated them, their mass would increase in
one case by about 20% and in the other by 40%.

The results of numerical simulations on DM halos are compatible
with such a mass increase. In more detail, Boylan-Kolchin
et al. (2008) showed that only merging of satellites
with mass ratio $\eta > 0.1$ can efficiently increase the mass of
their host galaxies. Also the merging rate for massive galaxies
inferred from numerical simulations by Stewart et al. (2008)
confirms that most of the mass is added by merging of satellites
with mass ratio $\eta > 0.1$. We stress, however, that these
simulations refer to the DM halos and their translation to the stellar
component of merging halos is not trivial.

To sum up, the data allow, at most, for a mass increase by
a factor of $\approx 2$ since $z \sim 2$ and by a factor of $\approx 1.5$-$1.7$
since $z \sim 1.5$. We notice also that if the growth occurs via
minor dry mergers, with no evolution of the galaxy number
density, practically all massive galaxies gradually increased
their mass throughout their entire lifetime, from the formation
redshift to $z = 0$ [see, e.g., the simulations by Naab et al.
(2009) and Stewart et al. (2008)]. But then also the galaxy
sizes should increase gradually over the full galaxy lifetime,
and this can hardly be reconciled with the much larger scatter
in size observed for ETG progenitors at $z > 1.5$ compared to
that at lower redshifts.

5.2. Size evolution 

The comparison of available data on $z > 1$ ETGs with the
local size distribution clearly points toward a mean size increase
by about a factor of 3 in order to bring the average $M_\star$
vs. $R_e$ relationship of high redshift ETGs to the local average
(cf. Fig. 1), though we caution that larger samples of high redshift
ETGs are needed. We stress that the observed small sizes
at high z are indeed expected (cf. Eq. (5)) if ETG progenitors
formed most of their stars in a rapid, dissipative collapse.

As a matter of fact, we expect that high redshift passively
evolving ETGs formed at redshift $z_{\rm form} \gtrsim 4$, larger than the
formation redshift $z_{\rm form} \sim 1.5$-$2$ of most local ETGs. From
the number density of halos with Me > IO'^Mq as a func- 
tion of redshift, we estimate that massive galaxies (with M,^ > 
$10^{11}\,M_\odot$) formed at redshift $z_{\rm form} \gtrsim 4$ are only 10% of those
formed at Zfoim ~ 2. Therefore, since oc (1 +Zfoi-ni)"'^'' (see 
§ 2.3), the local counterparts of high-z ETGs, 10% of the total
number of ETGs, are expected to exhibit a half-light radius
smaller than the average by a factor around 1.4.

Taking into account this bias, data in Fig. 1 and in Fig. 2
show that a significant fraction of local ETG precursors already
at $z > 1.5$ exhibit the same size as their local counterparts
of the same mass. On the other hand, there are also ETG
progenitors much more compact than their local counterparts,
with sizes less than about 1/6 of the local ones. As a matter of fact, the
dispersion in size at high redshift is larger than in the local
samples of ETGs. These properties of the size distribution
can be accounted for by a model yielding evolution in size
by large factors (> 5) on timescales shorter than the Hubble
time $t_H$ at $z > 1.5$. Ejection of large amounts of gas by quasar
feedback can reproduce the observed phenomenology. From
Eq. (7) it is apparent that large size expansions are possible,
even though gravity of DM halos will constrain them. Also
from Eq. (9) it is apparent that timescales from a few to several
$10^{7}$ yr can be required for the expansion. In this context
the scatter of sizes mirrors the spread of formation time and
the spread in the expansion phase, as illustrated by Fig. 1 and
Fig. 2. In particular, in Fig. 1 lines with an arrow represent the
size evolution predicted by our reference model. The lower
horizontal lines represent the time (translated to $\Delta z$) spent by
ETG progenitors in their dusty phase with quite large star formation
rate; submm surveys are quite efficient in selecting this
phase. Then the (almost vertical) lines represent the epoch of
the large size increase due to the gas outflow triggered by the
quasar activity; this phase begins with the quasar appearance
and lasts $\Delta T_{\rm size}$ (here $\Delta T_{\rm size} \sim 2 \times 10^{8}$ yr has been assumed).
Then a longer phase lasting for about the present Hubble time
follows, during which the size can increase by a smaller factor
because of mass loss due to galactic winds and/or minor dry
mergers.

Fig. 3. Cumulative stellar mass function. The different lines illustrate Schechter fits to the average stellar mass function at different redshifts as estimated by
Bell et al. (2003), Bernardi et al. (2009), Pozzetti et al. (2007), and Marchesini et al. (2009). All estimates have been scaled to Chabrier's (2003) IMF. Thick
solid lines with arrows highlight the mass evolution from $z \sim 1.5$ to the present allowed by the observed mass functions, starting from $M_\star \approx 2$ and $5 \times 10^{11}\,M_\odot$.

We note that quasar feedback has not been introduced
specifically to solve the size problem, but first by Silk &
Rees (1998) to predict the correlation in ETGs between
galaxy velocity dispersion and the present-day BH mass.
Soon after, Granato et al. (2001, 2004) showed that
the gas removal by quasar activity is also needed in order to
stop the star formation, preventing the formation of exceedingly
massive galaxies, too blue and with no enhancement of $\alpha$-elements
(cf. § 5.1). Sterilization of star formation by quasar
feedback implies that in a quite short timescale, an enormous
mass of gas is evacuated from the central galaxy regions and
possibly from the entire halo and subhalos. The gas evacuated
from the central regions can be of the same order as the mass
in stars, so about 10-20% of the total baryons in the galactic
halo. Observations of high redshift star forming galaxies
do find evidence of large fractions of gas in various states,
from molecular to highly ionized; in starforming galaxies at
$z \approx 2$ the mass in gas is of the same order as the mass in stars
(Cresci et al. 2009; Tacconi et al. 2008, 2010). Such winds
would then push out gas from the halo at a rate
(13)

where the reference values are for a DM halo with Mh ~ 
2 X 1O'^M0 formed at Zfonn ~ 3; Vgas is the escape velocity 
from $r \approx 1$ kpc and the gas mass is close to the stellar mass.
As pointed out by Silk & Rees (1998) and by Granato et al.
(2001, 2004) the energy released by a luminous quasar in its
last e-folding time is a factor of 20 larger than the energy associated
with these winds. On the observational side, hints of
massive outflows from high redshift quasars, consistent with
this scenario, have been reported (e.g., Simcoe et al. 2006;
Prochaska & Hennawi 2009). Therefore there are strong reasons
to believe that large gas outflows occurred in high redshift
quasar hosts.

As discussed in § 4.1, the removal of a mass in gas close to
the mass in stars destabilizes the mass distribution in the innermost
galaxy regions. In the case of strong quasar winds the
ejection and dynamical timescales are similar, $\tau_{\rm ej} \approx 1$-$3\,\tau_{\rm dyn}$.
Therefore the effect could be intermediate between those described
by Eq. (7) and by Eq. (8), as found with numerical
simulations by Geyer & Burkert (2001, cf. their Fig. 3) and by
Baumgardt & Kroupa (2007). A basic question is how long it
takes for the stellar structure to readjust to a new equilibrium.
In simulations of star clusters, i.e., without DM halo, the new
equilibrium is reached in 30-50 initial crossing times (Geyer
& Burkert 2001; Boily & Kroupa 2003; Bastian & Goodwin
2006). Under the hypothesis that the same number of crossing
times is also required for massive galaxies, the expected
timescale for size evolution would be $\Delta T_{\rm size} \approx 1.5 \times 10^{8}$ yr.
On the other hand, specific numerical simulations with high
temporal resolution are needed in order to assess the size evolution
timescale, since the presence of the DM halo, dominating
the potential well at $r > R_e$, could slow down the expansion,
increasing the time needed to reach the new size and
equilibrium. On the observational side, the duty cycle can be 
inferred only by studying the distribution of large samples of 
high-z galaxies in the Re vs. plane. 

Since quasar winds follow the time pattern of quasar shin- 
ing, the same is expected for the size evolution, except for a 
delay by $\Delta T_{\rm size}$. As a consequence, the inverse hierarchy or
downsizing seen in the quasar evolution is mirrored in the size 
evolution. 

Quasar activity is the main feedback mechanism for more
massive ETGs ($M_\star > 10^{11}\,M_\odot$), while supernova feedback
dominates at $M_\star < 2 \times 10^{10}\,M_\odot$ (see Granato et al. 2004; Lapi
et al. 2006; Shankar et al. 2006). Correspondingly, larger size
evolution is expected for larger mass ETGs, while the evolution
is progressively decreasing for lower mass and should be
negligible for $M_\star < 10^{10}\,M_\odot$; in addition, the scatter in size at
high z is much wider for more massive ETGs. Interestingly,
Lauer et al. (2007a) and Bernardi et al. (2007) found that
the relationship between effective radius $R_e$ and luminosity
steepens for ETGs brighter than $M_V \approx -21$, corresponding to
a stellar mass of $10^{11}\,M_\odot$.

If we assume that the change in size of ETGs is due to minor
dry mergers, we face a couple of problems. The upper limit to
the mass evolution ($M_f/M_i \approx 1.7$-$2$ since $z \approx 1.5$-$2$), plus
the fact that this happens gradually, implies that almost all
high redshift massive ETGs must increase their size at most
by a factor $\approx 2.2$-$3$. While this may be consistent with the
average size evolution, it does not account for the decreased
scatter of the size distribution from high to low redshifts. In
particular, since dry minor mergers require a long timescale
$\approx 10$ Gyr to produce their full effects, they cannot explain
why a significant fraction of the high-z ETGs are already on
the local mass-size relationship. Moreover, the upper limit on
the increase in mass entails an upper limit of a factor of 3 for the
size increase since $z \approx 2$, while at this redshift there are ETGs
with sizes smaller than the local ones by factors of 6-10.



5.3. Projected Central Mass evolution 



Clearly, the mass within the central regions after the expansion
driven by quasar winds in ETGs, when a new virial equilibrium
is reached with the same mass in stars, has to be lower
than that of the initial compact structure, analogously to what
happens for stellar clusters (Boily & Kroupa 2003; Baumgardt
& Kroupa 2007). To test this implication against the
data we compare the projected mass within the half-mass radius
$R_e(z \gtrsim 1)$ of each passively evolving galaxy at high redshift
with the average projected mass within the same physical
radius for local ETGs of the same overall stellar mass.

We stress that this test bypasses the problem of the reliability
of the estimates of the effective radius in high redshift
ETGs, since the mass inside the estimated effective radius is
much less uncertain than the value of the radius itself. We
checked this following the method of Hopkins et al. (2010),
who have illustrated the effect of a limited dynamical range
in surface brightness available for high redshift galaxies on
the estimate of the intrinsic index $n$ and of the half-mass radius
$R_e$. These authors shifted some of the Virgo cluster ETGs
to $z \sim 2$ and simulated HST observations of these objects.
Specifically, for NGC 4552 shifted to high redshift and assuming
$\Delta\mu \approx 4.5$, Hopkins et al. (2010) find that the original
$n_i = 9.22$ is fitted with $n_f \approx 6$ and $R_f \approx R_i/3$; as a result, the
total luminosity $L_f$ obtained by the fitting procedure is lower
than the 'true' one $L_i$ by a factor 1.5. In the case of NGC 4365
the corresponding luminosity ratio amounts to 1.12. However,
the luminosity $L_f(R_f)$ inside $R_f$ estimated through the fit is a
good estimate of the true luminosity, with errors within 20%
even in the case of large values of $n_i$. For instance, in the
case of NGC 4552 we get $L_f(R_f)/L_i(R_f) \approx 1.15$. In conclusion,
the fitting procedure for high redshift galaxies with
large intrinsic $n$ index underestimates the half-mass radius by
a factor of 3-4 and the total luminosity by a factor 1.5, but
yields accurate estimates of the luminosity (and of the mass)
inside the radius $R_f$. By the way, the same holds for the stellar
mass, after proper translation of the luminosity into mass.

In Fig. 4 we plot the ratio between the projected mass of
high redshift and local ETGs within the same physical radius,
namely, the half-luminosity radius of high-redshift ETGs. In
detail, since for high redshift ETGs $M_\star(<R_e(z), z) = M_{\star,\rm tot}/2$,
the ratio comes to

$$\frac{M_\star(<R_e(z), z=0)}{M_\star(<R_e(z), z)} = 2\,\Gamma\!\left[2n,\ b_n \left(\frac{R_e(z)}{\langle R_e(z=0)\rangle}\right)^{1/n}\right] \tag{14}$$

where $\Gamma$ is the (normalized) incomplete Gamma function.
Thus if the total mass does not change significantly, the quantity
$M_\star(<R_e(z), z)/M_\star(<R_e(z), z=0)$ depends only on the ratio
$r = R_e(z)/\langle R_e(z=0)\rangle$ and on the final Sersic index $n$.

To plot the data points under the hypothesis of no mass evolution,
we compute the ratio $r$ using the observed $R_e(z)$, and
exploiting the observed stellar mass to derive $\langle R_e(z=0)\rangle$
from the local $R_e$-$M_\star$ relation presented in Fig. 2. The horizontal
lines have been computed for three values of $r$; for the
sake of definiteness we adopt $n = 4$. The thick lines with arrows
illustrate the evolutionary track of massive galaxies according
to our reference model with $f_\sigma = 1.5$ and $z_{\rm form} = 3$;
these are found to be in encouraging agreement with the distribution
of data points.

In the case of local ETGs we can define a ratio
$r' = R_e(z=0)/\langle R_e(z=0)\rangle$ and compute the analogue of the mass ratio
defined by Eq. (14). The shaded area, containing 65% of local
ETGs, illustrates the uncertainty of data points and of the
horizontal lines associated with the assumption of an average
half radius $\langle R_e(z=0)\rangle$.

Fig. 4. Ratio between the projected stellar mass within the estimated half-mass radius $R_e$ for passively evolving ETG progenitors at $z > 1$ and the average
local value within the same physical radius and for the same overall stellar mass. The data points include observations of individual passively evolving galaxies
with spectroscopic redshifts $z > 1$ by Longhetti et al. (2007), Cimatti et al. (2008), Damjanov et al. (2009), Mancini et al. (2010), and van Dokkum et al. (2008).
The shaded area shows the distribution of local SDSS galaxies (Hyde & Bernardi 2009). The thin lines illustrate the expected stellar mass ratio for different size
increases from high z to the present, on adopting a local Sersic index $n = 4$ (see § 5.3 for more details). As in Fig. 2, thick lines with arrows illustrate typical
evolutionary tracks of massive galaxies according to our reference model (with $f_\sigma = 1.5$ and $z_{\rm form} = 3$).

Data points in Fig. 4 show that a significant fraction of high 
redshift passively evolving galaxies exhibit stellar mass inside 
their inferred half-mass radius larger by a factor 2-6 than 
the mass of their local counterparts within the same physical 
radius. 

Keeping mass and structural index n = 4 constant, larger 
mass ratios can be obtained increasing the half luminosity ra- 
dius by a factor from 2 to 6 (cf. horizontal lines). If we allow 
the structural index n to vary by one, as suggested by the sim- 
ulations of Naab et al. (2009), the change is minimal; even 
structural changes from n = 4 to n = 8 still would require a 
size increase by a factor of $\approx 6$ in order to explain galaxies
with the largest mass ratio. 
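
The dependence of the projected mass ratio on the size ratio $r$ can be sketched numerically for a Sersic profile. The code below assumes that the projected mass fraction within a radius $R$ is the regularized lower incomplete gamma function $P(2n, b_n (R/R_e)^{1/n})$, with $b_n$ defined by $P(2n, b_n) = 1/2$, and that $2n$ is an integer so that a closed-form sum can be used. The function names are hypothetical, and the sketch is not the procedure used to produce Fig. 4.

```python
import math


def reg_lower_gamma_int(a, x):
    """Regularized lower incomplete gamma P(a, x) for a positive integer a."""
    term, total = 1.0, 1.0
    for k in range(1, a):
        term *= x / k
        total += term
    return 1.0 - math.exp(-x) * total


def sersic_bn(n):
    """Solve P(2n, b_n) = 1/2 by bisection (2n must be an integer)."""
    a = round(2 * n)
    lo, hi = 0.0, 4.0 * n + 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma_int(a, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def highz_to_local_mass_ratio(r, n=4):
    """Projected mass of the high-z galaxy within its R_e over the local mass
    within the same radius, for r = R_e(z)/<R_e(z=0)> and equal total mass."""
    a = round(2 * n)
    x = sersic_bn(n) * r ** (1.0 / n)
    return 1.0 / (2.0 * reg_lower_gamma_int(a, x))


if __name__ == "__main__":
    for growth in (2, 3, 6):
        ratio = highz_to_local_mass_ratio(1.0 / growth)
        print(f"size growth x{growth}: mass ratio = {ratio:.2f}")
```

The ratio equals 1 for $r = 1$ and rises monotonically as $r$ decreases, so larger mass ratios require larger size increases at fixed $n$.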

This result suggests that the mass inside this physical radius 
has on the average decreased and disfavors mechanisms that 
increase the size by only adding stars in the outer regions of 
the ETGs. We notice that our argument involves a significant 
fraction of the total galaxy mass. Conversely, the comparison
of stellar surface density profiles within $R_e/50$ as performed
by Hopkins et al. (2009a) refers only to a tiny fraction
of the mass. It is interesting to note that Naab et al. (2009) 
find in their simulations that the dynamical friction is able to 
decrease the total mass inside < 1 kpc. 

Along the same lines, massive ETGs with their large sizes,
steeper correlation between effective radii and mass and large
Sersic index (Lauer et al. 2007b; Kormendy et al. 2009)
clearly stand as representative cases of galaxies which experienced
robust puffing up by quasar feedback. Moreover, the
correlation of the central BH mass with Sersic index n (Gra- 
ham et al. 2003; Graham & Driver 2007) for massive ETGs is 
consistent with the hypothesis that the strong feedback from 
the most massive BHs has led to a substantial increase of n. 

5.4. Velocity dispersion evolution 

As mentioned in § 3, velocity dispersion has so far been determined
only for a few individual high-z spheroidal galaxies.
The galaxy GMASS 2470 at $z \sim 1.4$ (Cappellari et al. 2009)
and galaxy #250425 at $z = 1.82$ (Onodera et al. 2010) are already
close to the local value for their mass, while 1255-0 at
$z \approx 2.2$ has a best fit velocity dispersion significantly higher
than the most massive local galaxies. Thus in the first two
cases evolution should have occurred before the cosmic time
corresponding to the observed redshift, whereas significant
evolution in size and velocity dispersion has to occur for the
higher redshift galaxy. The studies of velocity dispersions by
Cenarro & Trujillo (2009) and Cappellari et al. (2009), based
on stacked spectra, suggest that velocity dispersion evolution
is on the average needed.

An interesting hint on size and velocity dispersion average
evolution can be derived by studying the velocity dispersion
distribution function (VDF) of local ETGs (Sheth et al. 2003), following
the approach by Cirasuolo et al. (2005). From Eq. (7) and
the relations derived above, taking $n_i = 3$ and $n_f = 5$, we find, for the typical values
of the parameters discussed above ($f_\sigma = 1.5$, $r = 4$), the
relation $\sigma_{0,f} = 0.6\,V_H$; this associates to each forming halo the
final velocity dispersion of the most massive hosted galaxy.
By combining it with the formation rate of halos (see Appendix
A) for redshifts $z > 1$ we get the distribution function
of the stellar velocity dispersions (VDF). From Fig. 5 it is apparent
that the predicted VDF is in good agreement with the
observational estimate by Sheth et al. (2003). We stress that
the observed VDF highly constrains the global history of DM
galaxy halos and of their stellar content, and therefore is an
important benchmark for models of ETG formation (see also
Loeb & Peebles 2003).

Fig. 5. Velocity distribution function. Our results at different redshifts are compared with the observational estimates of the local velocity distribution
function by Sheth et al. (2003) and Shankar et al. (2004).

A change of the slope of the relationship between luminosity
and velocity dispersion at low luminosity has been claimed
by several authors (Shankar et al. 2006; Lauer et al. 2007a).
In particular, Lauer et al. (2007a) show that the change in
slope occurs at about the same luminosity, $M_V \approx -21$, where
the slope of the size vs. luminosity relation also changes (see
§ 6.2). Once more this feature occurs at the transition from the
supernova-dominated to the quasar-dominated feedback regime.

An additional relevant aspect is related to the $M_\bullet$ vs. $\sigma_0$
and $M_\bullet$ vs. $M_\star$ correlations. The Granato et al. (2004) model,
which we take as a reference, predicts that the mass in stars
formed before the quasar shines is strictly related to the mass
of the central BH, since the growth of the reservoir, which
eventually supplies the mass to the BH, is strictly proportional
to the star-forming activity (cf. Eq. [A20]). Marconi
& Hunt (2003) and Häring & Rix (2004) pointed out that the
$M_\bullet$ vs. $M_\star$ relation for ETGs and bulges exhibits a scatter
comparable with that in the $M_\bullet$ vs. $\sigma_0$ relation.

In this context it is interesting to mention that, according
to Lauer et al. (2007a), the extrapolation of the $\sigma_0$ vs.
$M_\bullet$ relationship holding at low mass to higher mass galaxies
would predict BH masses smaller than those inferred from the
stellar mass. This is expected if the velocity dispersion
decreased more significantly for higher mass galaxies, hosting
more massive BHs and subject to stronger quasar feedback.

6. SUMMARY AND CONCLUSIONS 

The half-luminosity radius of high redshift passively evolving
massive galaxies is observed to be, on average, significantly
smaller than that of their local counterparts with the
same stellar mass, but in agreement with theoretical predictions
based on the largely accepted assumption that most of
the stars have been formed during dissipative collapse of cold
gas. However, observations also show that the size distribution
of high redshift ETG progenitors is broader than the
corresponding distribution for local ETGs. While a significant
fraction of massive high redshift ETGs already exhibit sizes as
large as those of their local counterparts with the same mass,
for a number of ETGs the size has to increase by a factor of
5-10 to match the local half-mass radius. Though still scarce,
the available data on velocity dispersions are suggestive of a
correspondingly large scatter of the ratios between high-z and
local values at fixed stellar mass.

The analysis of several data sets, discussed in § 2, and notably
of the large sample by Maier et al. (2009) with spectroscopic
redshifts, strongly suggests that most of the size evolution
occurs at z > 1, while at z < 1 sizes increase by no
more than 40%. Moreover, a large fraction of high-z passively
evolving galaxies have a projected stellar mass within their
effective radii a factor of 2 larger than that of local ETGs with
the same stellar mass, within the same physical radius (see § 3).

All the above results are easily accounted for if most of the
size evolution is due to a puffing up driven by the rapid
expulsion of large amounts of mass, as proposed by Fan et al.
(2008). That most of the baryons initially associated with the
DM halo have to be expelled is strongly indicated by the fact
that the baryon to DM mass ratio in galaxies is much smaller
than the cosmic value. The quasar-driven winds advocated by
Fan et al. (2008) occur in the most massive galaxies, while
below $\approx 2 \times 10^{10}\,M_\odot$ the dominant energy input into the
interstellar medium comes from supernova explosions, which
induce a slower mass loss. The Fan et al. (2008) model
therefore predicts a milder size evolution for the less massive
spheroidal galaxies, while the size evolution of the more
massive galaxies should parallel the quasar evolution, with a
delay of about 0.5-1 Gyr. The dichotomy between low- and
high-mass galaxies, i.e., between supernova and quasar driven
feedback, is mirrored in the increase of the Sérsic index with
stellar mass, in the flattening of the total mass vs. velocity
dispersion relation toward massive galaxies, and in the corresponding
steepening of the correlation between effective radii and
stellar mass.

The alternative explanation invoking minor mergers faces
a couple of difficulties (see also Nipoti et al. 2009a,b). The
analysis of the mass function evolution shows that, under the
hypothesis of pure mass evolution, the upper limits to the
mass increase are a factor $\approx 2$ and a factor $\approx 1.7$ since z ~ 2
and z ~ 1.5, respectively. Also the increase is expected to
be gradual and rather uniform, so that practically all galaxies
undergo the same mass increase. As a consequence almost
all high redshift massive ETGs must evolve in size by a factor
$\lesssim 2.2$-3. While this upper limit may be consistent with
the average evolution of the size, the model does not account
for the substantially broader size distribution, for given stellar
mass, at high, compared to low, redshifts. In particular,
since dry minor mergers require a long timescale of $\approx 10$ Gyr
to produce their full effects, they cannot explain why a
significant fraction of the high-z ETGs are already on the local
mass-size relationship. Moreover, the upper limit in mass
entails a factor of $\lesssim 3$ in size evolution since z ~ 2, while at
the same redshift there are ETGs with half-mass radii undersized
by a factor 6-10. On the positive side, for the minor merger
scenario, the simulations by Naab et al. (2009) show that
dynamical friction is able to remove part of the mass from the
central regions, in line with what is suggested by observations
(see Fig. 4).

The virial theorem tells us that the velocity dispersion
scales as $\sigma^2 \propto S_D\,M/r$, where $S_D$ is a structure factor defined
by Prugniel & Simien (1997) in the case of a Sérsic profile.
Since the rapid loss of a large mass fraction destabilizes the
mass distribution, it may be expected that the final equilibrium
configuration differs from the initial one. In fact, the
data by van Dokkum et al. (2010) and simulations indicate
that the Sérsic index of local galaxies is, on average, higher
than for high-z galaxies. If so, the variation of $S_D$ partially
compensates the effect on $\sigma$ of the size increase. Although
measurements of the kinematic properties of high-z galaxies
are scarce, a velocity dispersion evolution compatible with the
expansion scenario is indicated (see § 5.4).
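
As a rough illustration of this scaling (a sketch in notation introduced here: $S$ stands for the structure factor, and the subscripts $i$ and $f$ label the configurations before and after the expansion), the ratio of final to initial velocity dispersion follows directly from the virial relation:

$$ \frac{\sigma_f}{\sigma_i} = \left(\frac{S_f}{S_i}\,\frac{M_f}{M_i}\,\frac{r_i}{r_f}\right)^{1/2}. $$

For instance, a fourfold size increase at fixed mass and fixed structure factor would halve the velocity dispersion, while an increase of the structure factor by a factor $x$ raises the ratio by $\sqrt{x}$, which is the partial compensation referred to above.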
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APPENDIX 

OVERVIEW OF OUR REFERENCE MODEL 

In recent years we developed a model of galaxy formation with focus on the evolution of baryons within protogalactic spheroids.
Baryons have been followed through simple physical recipes emphasizing the effects of the collapse and cooling and of the energy
fed back to the intragalactic gas by supernova (SN) explosions and by accretion onto the nuclear supermassive black holes (BHs;
see Granato et al. 2001, 2004; Lapi et al. 2006, 2008; Mao et al. 2007; Fan et al. 2008). The main motivation was to shed light on
the relevant physical processes shaping galaxy formation, to keep calculations easily reproducible and to suggest which processes
should be implemented in the much more complex and much less reproducible numerical simulations.

The model transparently shows how physical processes acting on the baryons speed up the formation of more massive galaxies.
As a result, although the DM assembly follows a bottom-up hierarchy, galaxies and their active nuclei evolve in a way that appears
opposite to the hierarchy in DM, following a pattern that we named Antihierarchical Baryon Collapse (ABC). We notice that it
fully corresponds, from the observational point of view, to the so-called downsizing.

We refer the interested reader to the above papers for a full account of the physical justification and a detailed description of the
model, with appropriate acknowledgment of previous work. Here we present a short summary of its main features, and provide
useful analytic approximations for quantities of relevance in this context.

DM sector 

As for the treatment of the DM in galaxies, the model follows the standard hierarchical clustering framework, and takes into
account the outcomes of recent intensive high-resolution N-body simulations of halo formation in a cosmological context (see
Zhao et al. 2003; Diemand et al. 2007; Hoffmann et al. 2007; Ascasibar & Gottloeber 2008). In these studies, two distinct
phases in the growth of DM halos have been recognized: an early fast collapse, and a later slow accretion phase. During the early
collapse, a substantial mass is gathered through major mergers, which effectively reconfigure the gravitational potential wells
and cause the collisionless DM particles to undergo dynamical relaxation and isotropization (Lapi & Cavaliere 2009). During
the later phase, moderate amounts of mass, around 20-50%, are slowly accreted mainly onto the halo outskirts, little affecting
the inner structure and potential, but quiescently rescaling upward the overall halo size. From the baryon point of view, the early
phase, which is our main interest here, supports the dissipationless collapse of baryons to originate a spheroidal structure dominated
by random motions (see also Cook et al. 2009).

Halos harboring a massive elliptical galaxy, once created, even at high redshift, are rarely destroyed, while at low redshifts
they are possibly incorporated within galaxy groups and clusters. Thus at z > 1, the halo formation rate can be reasonably well
approximated by the positive term in the derivative of the halo mass function with respect to cosmic time (e.g., Haehnelt & Rees
1993; Sasaki 1994). The halo mass function derived from numerical simulations (e.g., Jenkins et al. 2001) is well fit by the
Sheth & Tormen (1999, 2002) formula, which improves over the original Press & Schechter (1974) expression (well known to
under-predict by a large factor the massive halo abundance at high redshift). Adopting the Sheth & Tormen (1999) mass function
$N_{\rm ST}(M_H,t)$, the formation rate of DM halos is given by

$$ \frac{d^2 N_{\rm ST}}{dM_H\,dt} = \left[\frac{a\,\delta_c(t)}{\sigma^2(M_H)} + \frac{2p}{\delta_c(t)}\,\frac{\sigma^{2p}(M_H)}{\sigma^{2p}(M_H) + a^p\,\delta_c^{2p}(t)}\right] \left|\frac{d\delta_c}{dt}\right|\,N_{\rm ST}(M_H,t)\,; \tag{A1} $$

here $a = 0.707$ and $p = 0.3$ are constants obtained by comparison with N-body simulations, $\sigma(M_H)$ is the mass variance of the
primordial perturbation field computed from the Bardeen et al. (1986) power spectrum with correction for baryons (Sugiyama
1995) and normalized to $\sigma_8 \approx 0.8$ on a scale of $8\,h^{-1}$ Mpc, and $\delta_c(t)$ is the critical threshold for collapse extrapolated from the
linear perturbation theory.
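
The weight multiplying the mass function in Eq. (A1) can be sketched numerically as follows. This is a minimal illustration, assuming the standard Sheth & Tormen form of the positive term of the time derivative; the function and argument names are ours (hypothetical), and the mass function value `n_st`, the mass variance `sigma` and the rate `ddelta_c_dt` are assumed to be supplied by the caller.

```python
def formation_rate_weight(delta_c, sigma, a=0.707, p=0.3):
    """Bracketed factor of Eq. (A1), i.e. the positive part of
    -d ln N_ST / d delta_c for the Sheth & Tormen mass function.

    delta_c : critical linear overdensity for collapse at the chosen time
    sigma   : mass variance sigma(M_H) of the perturbation field
    """
    gaussian_term = a * delta_c / sigma**2
    power_ratio = sigma**(2 * p) / (sigma**(2 * p) + a**p * delta_c**(2 * p))
    return gaussian_term + (2.0 * p / delta_c) * power_ratio


def halo_formation_rate(n_st, delta_c, ddelta_c_dt, sigma, a=0.707, p=0.3):
    """Formation rate d^2N/(dM dt), given the mass function value n_st."""
    return formation_rate_weight(delta_c, sigma, a, p) * abs(ddelta_c_dt) * n_st


if __name__ == "__main__":
    # Illustrative inputs only: delta_c = 1.686 and a few values of sigma.
    for sigma in (0.5, 1.0, 2.0):
        print(sigma, formation_rate_weight(1.686, sigma))
```

The weight grows toward small $\sigma$, that is toward rare, massive halos, for which the Gaussian term dominates.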

As for the density distribution of DM halos, we adopt as a reference the profile proposed by Navarro et al. (1997), characterized
by a scale radius $r_s$ and by the ratio of the virial to the scale radius $c = R_H/r_s$, the concentration parameter, with typical
values around 4 at halo formation (e.g., Zhao et al. 2003). The halo circular velocity $V_H = (G M_H/R_H)^{1/2}$ characterizes the DM
potential well; the associated velocity dispersion is $\sigma_{\rm DM} = f(c)^{1/2}\,V_H$, where $f(c) \approx 2/3 + (c/21.5)^{0.7}$ is a weak function of the
concentration parameter of order 1 (see Mo et al. 1998).
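
The relations in this paragraph can be evaluated with a few lines of Python. This is only a sketch: the function names are ours, the virial radius is an input rather than being computed from a cosmology, and we assume that the velocity dispersion scales with the square root of $f(c)$, with $f(c)$ taken in the Mo et al. (1998) form.

```python
import math

# Gravitational constant in (km/s)^2 kpc / Msun.
G_KPC = 4.30091e-6


def circular_velocity(m_halo, r_halo_kpc):
    """Halo circular velocity V_H = sqrt(G M_H / R_H) in km/s."""
    return math.sqrt(G_KPC * m_halo / r_halo_kpc)


def f_c(c):
    """Weak function of the concentration, f(c) ~ 2/3 + (c/21.5)^0.7."""
    return 2.0 / 3.0 + (c / 21.5) ** 0.7


def sigma_dm(v_h, c=4.0):
    """DM velocity dispersion, assuming sigma_DM = sqrt(f(c)) * V_H."""
    return math.sqrt(f_c(c)) * v_h


if __name__ == "__main__":
    v_h = circular_velocity(1e12, 100.0)  # illustrative mass and radius
    print(v_h, f_c(4.0), sigma_dm(v_h))
```

For $c = 4$ the factor $f(c)$ is within a few percent of unity, so the DM velocity dispersion is nearly equal to the circular velocity.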



Baryonic sector 

During the fast collapse phase, a rapid sequence of major mergers builds up a DM halo of mass $M_H$. At that time a mass
$M_{\rm inf} = f_b\,M_H$ of baryonic matter, in cosmic proportion $f_b \approx 0.2$ with the DM, is shock heated to the virial temperature by falling
into the forming DM gravitational potential well. This hot gas cools and flows toward the central region at a rate

$$ \dot M_{\rm cond} = \frac{M_{\rm inf}}{t_{\rm cond}} \tag{A2} $$

over the condensation timescale $t_{\rm cond} = \max[t_{\rm cool}(R_H), t_{\rm dyn}(R_H)]$, namely, the maximum between the dynamical and the cooling
time at the halo virial radius $R_H$. When computing the cooling time, a clumping factor in the gas $C \gtrsim$ a few, as suggested by
numerical simulations (e.g., Gnedin & Ostriker 1997; Iliev et al. 2006), implies $t_{\rm cool}(R_H) \lesssim t_{\rm dyn}(R_H)$ on relevant galaxy scales at

We recall that the star formation in galaxy halos is a quite inefficient process (see Fig. A1), since only a minor fraction, 10-20%,
of the available baryons is cycled through stars in more massive halos, and the fraction rapidly decreases with decreasing
halo mass. As a result, the present-day cosmic mass density in stars is only a few percent of the mass density in baryons (e.g.,
Shankar et al. 2006). Thus the formation of the most massive galaxy in a halo is a process that involves only a fraction smaller
than 20-30% of its original baryons and DM. It is natural to assume that the material is rapidly put together by a few major
mergers in the central regions of the halo. In these mergers the angular momentum decays on a dynamical friction timescale
$t_{\rm DF} \approx 0.2\,(\xi/\ln\xi)\,t_{\rm dyn}$, where $\xi = M_H/M_c$ holds in terms of the typical cloud mass $M_c$ involved in major mergers (e.g. Mo & Mao
2004); these are very frequent at high redshift and in the central regions of halos during the fast collapse phase of DM evolution,
implying $\xi \sim$ a few and hence a short $t_{\rm DF}$. Thus the effects of angular momentum can be neglected.

The model also assumes that quasar (QSO) activity removes not only cold gas from the galaxy, but also hot gas from the halo
through winds at a rate $\dot M_{\rm inf}^{\rm QSO}$, to be quantitatively discussed next; the equation for the diffuse hot gas is then

$$ \dot M_{\rm inf} = -\dot M_{\rm cond} - \dot M_{\rm inf}^{\rm QSO}. \tag{A3} $$



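The cold gas piled up by the cooling of hot gas is partially consumed by star formation ($\dot M_\star$), and partially removed to a
warm/hot phase endowed with a long cooling time by the energy feedback from SNe ($\dot M_{\rm cold}^{\rm SN}$) and QSO activity ($\dot M_{\rm cold}^{\rm QSO}$):

$$ \dot M_{\rm cold} = \dot M_{\rm cond} - [1 - \mathcal{R}(t)]\,\dot M_\star - \dot M_{\rm cold}^{\rm SN} - \dot M_{\rm cold}^{\rm QSO}, \tag{A4} $$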

where $\mathcal{R}(t)$ is the fraction of gas returned to the cold component by the evolved stars. It depends on time (particularly soon after
the onset of star formation) and on the assumed initial mass function (IMF). We adopt for reference a pseudo-Chabrier IMF of
shape $\phi(m_\star) = m_\star^{-x}$ with $x = 1.4$ for $0.1 \le m_\star \le 1\,M_\odot$ and $x = 2.35$ for $m_\star > 1\,M_\odot$. The often used approximation of instantaneous
recycling implies $\mathcal{R} \approx 0.54$ (for a Salpeter IMF one has $\mathcal{R} \approx 0.3$). The mass of cold baryons that is going to be accreted onto the
central supermassive BH is small enough to be neglected in the above equation.
Stars are formed at a rate

$$ \dot M_\star = \int \frac{dM_{\rm cold}}{\max[t_{\rm cool}, t_{\rm dyn}]} \approx \frac{M_{\rm cold}}{t_\star}, \tag{A5} $$

where now $t_{\rm cool}$ and $t_{\rm dyn}$ refer to a mass shell $dM_{\rm cold}$, and $t_\star$ is the star formation timescale averaged over the mass distribution.
Star formation promotes the gathering of some cool gas into a low-angular-momentum reservoir around the central supermassive
BH. A viable mechanism for this process is the radiation drag (see discussion by Umemura 2001; Kawakatu & Umemura 2002;
Kawakatu et al. 2003), which has the attractive feature of predicting a mass transfer rate to the reservoir proportional to the SFR:

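$$ \dot M_{\rm inflow} \approx \frac{L_\star}{c^2}\,\left(1 - e^{-\tau_{\rm RD}}\right) \approx \alpha_{\rm RD} \times 10^{-3}\,\dot M_\star\,\left(1 - e^{-\tau_{\rm RD}}\right); \tag{A6} $$

the constant of proportionality $\alpha_{\rm RD} \sim 1$-3 can be fixed to produce a good match to the correlation between the spheroid and the
supermassive BH masses observed in the local Universe, while the quantity

$$ \tau_{\rm RD}(t) = \tau_{\rm RD}^0\,\frac{Z(t)}{Z_\odot}\,\frac{M_{\rm cold}(t)}{10^{12}\,M_\odot}\left(\frac{M_H}{10^{13}\,M_\odot}\right)^{-2/3} \tag{A7} $$

represents the effective optical depth of the gas clouds in terms of the normalization parameter $\tau_{\rm RD}^0 \sim 1$-2 (for more details, see
the discussion around Eqs. [14] to [17] in Granato et al. 2004).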
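
The radiation-drag prescription can be written as a short Python sketch. It assumes the reading of Eq. (A6) in which the inflow rate equals $\alpha_{\rm RD} \times 10^{-3}$ times the SFR times $(1 - e^{-\tau_{\rm RD}})$, and an optical depth scaling linearly with metallicity and cold gas mass and as the $-2/3$ power of the halo mass; the function names and the default normalizations (`tau0`, `alpha_rd`) are illustrative choices of ours within the quoted ranges.

```python
import math


def optical_depth(z_metal, m_cold, m_halo, tau0=1.0):
    """Effective optical depth of the gas clouds.

    z_metal : gas metallicity in solar units
    m_cold  : cold gas mass in Msun
    m_halo  : halo mass in Msun
    tau0    : normalization parameter (assumed of order 1-2)
    """
    return tau0 * z_metal * (m_cold / 1e12) * (m_halo / 1e13) ** (-2.0 / 3.0)


def reservoir_inflow_rate(sfr, tau_rd, alpha_rd=1.0):
    """Mass inflow rate into the reservoir, in the same units as the SFR."""
    return alpha_rd * 1e-3 * sfr * (1.0 - math.exp(-tau_rd))


if __name__ == "__main__":
    tau = optical_depth(z_metal=1.0, m_cold=1e11, m_halo=1e13)
    print(tau, reservoir_inflow_rate(sfr=1000.0, tau_rd=tau))
```

In the optically thick limit the inflow saturates at about one thousandth of the SFR, while for small optical depth it scales linearly with $\tau_{\rm RD}$.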

Eventually, the gas stored in the reservoir accretes onto the BH, powering the nuclear activity; usually, plenty of material is
supplied to the BH, so that the latter can accrete close to the Eddington limit

$$ \dot M_\bullet = \lambda_{\rm Edd}\,\frac{1-\epsilon}{\epsilon}\,\frac{M_\bullet}{t_{\rm Edd}} \tag{A8} $$

and grows almost exponentially from a seed of $10^2\,M_\odot$; the e-folding time involves the Eddington timescale $t_{\rm Edd} \approx 4 \times 10^8$ yr,
the radiative efficiency $\epsilon \approx 0.1$, and the actual Eddington ratio $\lambda_{\rm Edd} \approx 0.3$-3. The reservoir mass variation is ruled by the balance
between the inflow due to radiation drag and the accretion onto the BH:

$$ \dot M_{\rm res} = \dot M_{\rm inflow} - \dot M_\bullet. \tag{A9} $$
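
To make the exponential growth concrete, the following sketch computes the e-folding time implied by Eddington-limited accretion and the time needed to reach a given BH mass. The function names are ours, the growth is assumed to proceed at constant Eddington ratio and efficiency, and the seed mass default is an illustrative placeholder rather than a value taken from the model.

```python
import math


def efolding_time(epsilon=0.1, lambda_edd=1.0, t_edd=4e8):
    """e-folding time (yr) of M_BH for dM/dt = lambda (1-eps)/eps * M / t_Edd."""
    return epsilon * t_edd / ((1.0 - epsilon) * lambda_edd)


def bh_mass(t, m_seed=1e2, epsilon=0.1, lambda_edd=1.0, t_edd=4e8):
    """BH mass (Msun) after a time t (yr) of uninterrupted accretion."""
    return m_seed * math.exp(t / efolding_time(epsilon, lambda_edd, t_edd))


def time_to_reach(m_final, m_seed=1e2, epsilon=0.1, lambda_edd=1.0, t_edd=4e8):
    """Time (yr) needed to grow from m_seed to m_final."""
    return efolding_time(epsilon, lambda_edd, t_edd) * math.log(m_final / m_seed)


if __name__ == "__main__":
    print(efolding_time())        # about 4.4e7 yr
    print(time_to_reach(1e9))     # several 1e8 yr for these assumptions
```

With these assumptions the e-folding time is a few times $10^7$ yr, so most of the final BH mass is accumulated in the last couple of e-folding times.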

The energy fed back to the gas by SN explosions and BH activity regulates the ongoing star formation and BH growth. The
two feedback processes have very different dependencies on halo mass and on galaxy age. The feedback due to SN explosions
removes the star-forming gas at a rate

$$ \dot M_{\rm cold}^{\rm SN} = \beta_{\rm SN}\,\dot M_\star, \tag{A10} $$

where the efficiency of gas removal

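$$ \beta_{\rm SN} = \frac{N_{\rm SN}\,\epsilon_{\rm SN}\,E_{\rm SN}}{E_{\rm bind}} \approx 0.6 \left(\frac{N_{\rm SN}}{8 \times 10^{-3}/M_\odot}\right) \left(\frac{\epsilon_{\rm SN}}{0.05}\right) \left(\frac{E_{\rm SN}}{10^{51}\,{\rm erg}}\right) \left(\frac{M_H}{10^{12}\,M_\odot}\right)^{-2/3} \left(\frac{1+z}{4}\right)^{-1} \tag{A11} $$

depends on the number of SNe per unit solar mass of condensed stars $N_{\rm SN}$ (of order $1.4 \times 10^{-2}$ for the Chabrier IMF), on the
energy per SN available to remove the cold gas $\epsilon_{\rm SN} E_{\rm SN}$, and on the specific binding energy of the gas within the DM halo, $E_{\rm bind}$.
Following Zhao et al. (2003) and Mo & Mao (2004), the latter quantity has been estimated for z > 1 as $E_{\rm bind} = V_H^2\,f(c)\,(1+f_b)/2 \approx
3.2 \times 10^{14}\,(M_H/10^{12}\,M_\odot)^{2/3}\,[(1+z)/4]$ cm$^2$ s$^{-2}$.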
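
A minimal numerical sketch of the SN feedback efficiency follows. It assumes the binding-energy scaling $E_{\rm bind} \approx 3.2 \times 10^{14}\,(M_H/10^{12}\,M_\odot)^{2/3}[(1+z)/4]$ cm$^2$ s$^{-2}$ and the definition $\beta_{\rm SN} = N_{\rm SN}\,\epsilon_{\rm SN} E_{\rm SN}/E_{\rm bind}$; the function names and the default values of `eps_sn` and `e_sn` are illustrative assumptions of ours, not parameters confirmed by the text.

```python
M_SUN_G = 1.989e33  # solar mass in grams


def e_bind(m_halo, z):
    """Specific binding energy of the gas in the halo, in cm^2 s^-2."""
    return 3.2e14 * (m_halo / 1e12) ** (2.0 / 3.0) * (1.0 + z) / 4.0


def beta_sn(m_halo, z, n_sn=1.4e-2, eps_sn=0.05, e_sn=1e51):
    """SN gas-removal efficiency.

    n_sn   : number of SNe per solar mass of stars formed
    eps_sn : fraction of the SN energy coupled to the cold gas (assumed)
    e_sn   : energy per SN in erg (assumed)
    """
    energy_per_gram = (n_sn / M_SUN_G) * eps_sn * e_sn  # erg/g = cm^2 s^-2
    return energy_per_gram / e_bind(m_halo, z)


if __name__ == "__main__":
    for m_halo in (1e11, 1e12, 1e13):
        print(m_halo, beta_sn(m_halo, z=3.0))
```

The efficiency drops as $M_H^{-2/3}$, which is why SN feedback dominates in low-mass halos and becomes ineffective in the most massive ones.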

In Granato et al. (2004) it is assumed that the QSO feedback acts both on the cold and on the infalling gas, unbinding
them from the DM halo potential well at a rate $\dot M^{\rm QSO}_{\rm inf,cold} = \dot M_{\rm wind} \times M_{\rm inf,cold}/(M_{\rm inf} + M_{\rm cold})$ proportional to the corresponding mass
fraction and to the wind mass outflow rate

$$ \dot M_{\rm wind} = \frac{L_{\rm QSO}}{E_{\rm bind}}, \tag{A12} $$

with

$$ L_{\rm QSO} \approx 2 \times 10^{44}\,\epsilon_{\rm QSO} \left(\frac{M_\bullet}{10^{8}\,M_\odot}\right)^{3/2}\ {\rm erg\ s^{-1}}. \tag{A13} $$

Here $\epsilon_{\rm QSO} = (f_h/0.5)\,(f_c/0.1)\,N_{22}$ is the strength of QSO feedback, expected to be close to unity; $f_h$ parameterizes the efficiency
of energy transfer from winds generated close to the accretion disc to the general interstellar medium, $f_c$ is the covering factor of
such winds, and $N_{22}$ is the hydrogen column density toward the BH in units of $10^{22}$ cm$^{-2}$ (cf. Eqs. [29], [30] and [31] in Granato
et al. 2004). With $\epsilon_{\rm QSO} \approx 1.3$ the bright end of the galaxy luminosity function is reproduced (Lapi et al. 2006).

As a consequence, the QSO feedback grows exponentially during the early phases of galaxy evolution, following the exponential
growth of the supermassive BH mass. It is negligible in the first 0.5 Gyr in all halos, but abruptly becomes dominant in
DM halos more massive than $10^{12}\,M_\odot$. Eventually, in these systems most of the gas becomes unbound from the potential well
of the galaxy halo, so that star formation and BH activity itself come to an end on a timescale which is shorter for more massive
galaxies.

Indeed, the positive feedback on BH growth caused by star formation, in cooperation with the immediate and negative feedback 
of SN, and the abrupt and dramatic effect of QSO feedback, are able to reverse the formation sequence of the baryonic component 
of galaxies compared to that of DM halos: the star formation and the buildup of central BHs are completed more rapidly in the 
more massive halos, thus accounting for the phenomenon now commonly referred to as downsizing.

Fig. 6. Evolution with galactic age of the stellar, reservoir, and BH masses (left axis scale) and of the SFR (right axis scale) in our reference model (Granato et
al. 2004) for galaxy halos with mass $10^{12}\,M_\odot$ and $10^{13}\,M_\odot$ formed at $z_{\rm form} = 3$.



The model yields as outputs the time evolution of the stellar and gas (including metals) components of the galaxies and of the
associated active galactic nuclei. When the star formation, gas abundance and the chemical evolution history of a galaxy within a
DM halo of given mass have been computed, the dust abundance can be inferred (e.g., Mao et al. 2007). Then the spectral energy
distribution as a function of time from extreme-UV to radio frequencies can be obtained through spectrophotometric codes, such
as GRASIL, which includes a treatment of dust reprocessing (Silva et al. 1998; Schurer et al. 2009). Coupling these results with
the DM halo formation rate, we can obtain the statistics of galaxies and supermassive BHs/QSOs as a function of cosmic time,
in different observational bands.

Analytic approximations 

By analyzing the results of the numerical solution for the full set of equations given above, it is apparent that in massive galaxies
the term of QSO feedback is important only during the final stage of BH growth, around 2-3 e-folding times (approximately
$10^8$ yr) before the peak of QSO luminosity, when the energy discharged by the QSO is powerful enough to unbind most of the residual
gas, quenching both star formation and further accretion onto the supermassive BH. On the other hand, in less massive galaxies
the central BH mass and the associated accretion are not able to stop star formation, which lasts for several Gyr. The duration of
the star formation $\Delta t_{\rm burst}$ can be approximated by a simple analytical form

$$ \Delta t_{\rm burst} \approx 6 \times 10^{8} \left(\frac{1+z}{4}\right)^{-1.5} \mathcal{F}\!\left(\frac{M_H}{10^{12}\,M_\odot}\right)\ {\rm yr}, \tag{A14} $$

where $\mathcal{F}(x) = 1$ for $x \ge 1$ and $\mathcal{F}(x) = x^{-1}$ for $x < 1$. A good approximation for the star formation history in massive galaxies is
obtained by neglecting the QSO feedback effect in Eqs. (A3) and (A4) and by abruptly stopping star formation and accretion
onto the central BH after $\Delta t_{\rm burst}$ since halo formation.
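
The burst duration can be coded directly. The sketch below assumes our reading of Eq. (A14), namely a normalization of $6 \times 10^8$ yr at $z = 3$, a redshift scaling $[(1+z)/4]^{-1.5}$, and a mass factor equal to 1 above $10^{12}\,M_\odot$ and to $(M_H/10^{12}\,M_\odot)^{-1}$ below it; the function name is ours.

```python
def burst_duration(m_halo, z_form):
    """Duration of the main star formation episode, in yr.

    m_halo : halo mass in Msun
    z_form : halo formation redshift
    """
    x = m_halo / 1e12
    mass_factor = 1.0 if x >= 1.0 else 1.0 / x
    return 6e8 * ((1.0 + z_form) / 4.0) ** (-1.5) * mass_factor


if __name__ == "__main__":
    for m_halo in (1e11, 1e12, 1e13):
        print(m_halo, burst_duration(m_halo, z_form=3.0))
```

Under these assumptions the burst lasts well under 1 Gyr in massive halos at high redshift and several Gyr in low-mass halos, which is the downsizing behavior described above.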
Then Eqs. (A3) and (A4) can be easily solved, with the outcome that the infalling mass declines exponentially as

$$ M_{\rm inf}(t) = M_{b,i}\,e^{-t/t_{\rm cond}}, \tag{A15} $$

where we assume $M_{b,i} = f_b\,M_H$, with $f_b = \Omega_b/\Omega_{\rm DM} \approx 0.2$. The cold gas mass evolves according to

$$ M_{\rm cold}(t) = \frac{M_{b,i}}{s\gamma - 1}\left[e^{-t/t_{\rm cond}} - e^{-s\gamma\,t/t_{\rm cond}}\right]; \tag{A16} $$

here we have introduced the shorthand $\gamma = 1 - \mathcal{R} + \beta_{\rm SN}$. The quantity $s = t_{\rm cond}/t_\star$ is the ratio between the timescale for the
large-scale infall estimated at the virial radius and the star formation timescale in the central region; it corresponds to $s \approx 5$, both for
an isothermal or NFW density profile with standard value of the concentration parameter $c \approx 4$ at halo formation. We notice
that the dependence on the fraction of gas returned by the stars is quite weak and that the value obtained by the hypothesis of
instantaneous recycling can be used.
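
For completeness, the steps leading to the cold gas solution can be sketched as follows, under the stated approximation of neglecting the QSO terms and with the initial condition $M_{\rm cold}(0) = 0$ (an assumption made here for the sketch). With $\dot M_\star = M_{\rm cold}/t_\star$, $\dot M^{\rm SN}_{\rm cold} = \beta_{\rm SN}\,\dot M_\star$, $\gamma = 1 - \mathcal{R} + \beta_{\rm SN}$ and $s = t_{\rm cond}/t_\star$, the hot and cold gas equations reduce to

$$ \dot M_{\rm inf} = -\frac{M_{\rm inf}}{t_{\rm cond}}, \qquad \dot M_{\rm cold} = \frac{M_{\rm inf}}{t_{\rm cond}} - \frac{s\gamma}{t_{\rm cond}}\,M_{\rm cold}. $$

The first gives $M_{\rm inf}(t) = M_{\rm inf}(0)\,e^{-t/t_{\rm cond}}$. Inserting it into the second and multiplying by the integrating factor $e^{s\gamma\,t/t_{\rm cond}}$,

$$ \frac{d}{dt}\left[M_{\rm cold}\,e^{s\gamma\,t/t_{\rm cond}}\right] = \frac{M_{\rm inf}(0)}{t_{\rm cond}}\,e^{(s\gamma - 1)\,t/t_{\rm cond}} \quad\Longrightarrow\quad M_{\rm cold}(t) = \frac{M_{\rm inf}(0)}{s\gamma - 1}\left[e^{-t/t_{\rm cond}} - e^{-s\gamma\,t/t_{\rm cond}}\right]. $$

Since $s\gamma > 1$ for the typical values quoted, the cold gas mass first rises on the short timescale $t_{\rm cond}/(s\gamma)$ and then declines following the infall.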
The corresponding stellar mass reads:

$$ M_\star(t) = \frac{s\,M_{b,i}}{s\gamma - 1}\left[1 - e^{-t/t_{\rm cond}} - \frac{1}{s\gamma}\left(1 - e^{-s\gamma\,t/t_{\rm cond}}\right)\right]; \tag{A17} $$

the mass of all stars formed during the main episode of star formation is then $M_\star^f \approx M_\star(\Delta t_{\rm burst})$.
In Eqs. (A13), (A14), and (A16) the condensation timescale is well approximated by

$$ t_{\rm cond} \approx 9 \times 10^{8} \left(\frac{1+z}{4}\right)^{-1.5} \left(\frac{M_H}{10^{12}\,M_\odot}\right)^{0.2}\ {\rm yr}. \tag{A18} $$

The scaling with redshift and mass mainly reflects the behavior of the dynamical/cooling time, while a mild dependence on $M_H$
is also related to the impact of the energy feedback from the QSO on the infalling gas, which is stronger for more massive halos
hosting more massive BHs.
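
The analytic approximation can be checked against a direct integration of the simplified equations. The sketch below assumes exponential infall, a star formation rate $s\,M_{\rm cold}/t_{\rm cond}$ and a gas consumption rate $\gamma$ times the SFR, with no QSO feedback; the closed forms are the ones that follow from these assumptions, and the function names, the forward-Euler scheme and the numerical values in the example are illustrative choices of ours.

```python
import math


def analytic_masses(t, m_bi, t_cond, s=5.0, gamma=1.06):
    """Closed-form infalling, cold-gas and stellar masses at time t."""
    x = t / t_cond
    sg = s * gamma
    m_inf = m_bi * math.exp(-x)
    m_cold = m_bi / (sg - 1.0) * (math.exp(-x) - math.exp(-sg * x))
    m_star = s * m_bi / (sg - 1.0) * (
        (1.0 - math.exp(-x)) - (1.0 - math.exp(-sg * x)) / sg
    )
    return m_inf, m_cold, m_star


def integrate_masses(t_end, m_bi, t_cond, s=5.0, gamma=1.06, n_steps=100000):
    """Forward-Euler integration of the same simplified equations."""
    dt = t_end / n_steps
    m_inf, m_cold, m_star = m_bi, 0.0, 0.0
    for _ in range(n_steps):
        cond_rate = m_inf / t_cond      # condensation rate of hot gas
        sfr = s * m_cold / t_cond       # star formation rate
        m_inf -= cond_rate * dt
        m_cold += (cond_rate - gamma * sfr) * dt
        m_star += sfr * dt
    return m_inf, m_cold, m_star


if __name__ == "__main__":
    # Illustrative case: 2e11 Msun of baryons, t_cond = 9e8 yr, burst of 6e8 yr.
    print(analytic_masses(6e8, 2e11, 9e8))
    print(integrate_masses(6e8, 2e11, 9e8))
```

A useful sanity check is mass conservation: the infalling mass, the cold gas mass and $\gamma$ times the stellar mass formed always add up to the initial baryon mass.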

A good approximation for $f_{\rm gas}$, i.e. the ratio between the stellar and gaseous mass immediately before the gas is swept away by
the QSO feedback, can be obtained using Eqs. (A16) and (A17). The gas mass includes both the cold gas and the gas returned
by the stars. The latter is estimated as $\mathcal{R}\,M_\star$ with $\mathcal{R} = 0.3$, which corresponds to the gas returned by the stars about 0.1 Gyr after
a burst of star formation with a Chabrier's IMF. We find that the results are well approximated by /gas ~ (M^/IO^M©)"^. 

For Mh ^ 3 x IO'^Mq, corresponding to > lO'^M©, the dependence on halo mass and formation redshift of ratio m 
between the halo mass and the surviving stellar mass (i.e. the present day stellar mass fraction) can be approximated as m ~ 
25 (Mh/IO'^Mq)" ' [(1 +Zform)/4]~'''^^. At lower masses m increases rapidly with decreasing M^, (see Shankar et al. 2006). 

Finally, considering that all the mass flowed into the reservoir is eventually accreted by the BH and neglecting the mass of the
seed BH ($\sim 10^2\,M_\odot$), one can write the relic BH mass as a function of the overall mass in stars formed during the star-forming
phase $\Delta t_{\rm burst}$ as

$$ M_\bullet^f \approx \alpha_{\rm RD} \times 10^{-3}\,M_\star^f\,\left(1 - e^{-\langle\tau_{\rm RD}\rangle}\right). \tag{A19} $$

The time average of the optical depth $\langle\tau_{\rm RD}\rangle$ can be approximated as

implying $1 - e^{-\langle\tau_{\rm RD}\rangle} \approx 1$ for massive galaxies and $1 - e^{-\langle\tau_{\rm RD}\rangle} \approx \langle\tau_{\rm RD}\rangle \propto M_H^{2/3}$ for less massive galaxies. As expected, most of the
mass flows from the reservoir to the central BH in the final couple of e-folding times. At early times the ratio of the BH to the
stellar mass is predicted to be much lower than the final value (cf. Fig. A1).

In Fig. A1 we show, as illustrative examples, the evolution with galaxy age of the stellar, reservoir and BH masses, and of the
star formation rate (SFR) for galaxies with halo masses of $10^{12}\,M_\odot$ and $10^{13}\,M_\odot$, formed at $z_{\rm form} = 3$.



